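Sometimes, whether to keep information confidential or simply to make code easier to read, we need to strip the comments from source code.
I considered regular expressions at first, but they turned out to be quite cumbersome to get right. A state machine, by contrast, groups the many possible cases into a handful of states and handles each one separately, which greatly simplifies the problem. The implementation in this article is based on a state machine.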
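To illustrate why a single regular expression is awkward here, the following minimal sketch applies a naive `//.*` pattern to a line whose string literal happens to contain `//`. The class name `NaiveRegexDemo`, the method `stripLineComments`, and the sample input are assumptions made up for this illustration; they do not come from the original program.

```java
import java.util.regex.Pattern;

class NaiveRegexDemo {
    // Naive pattern: everything from the first "//" to the end of the line.
    private static final Pattern LINE_COMMENT = Pattern.compile("//.*");

    // Removes whatever the naive pattern matches, with no notion of string literals.
    static String stripLineComments(String code) {
        return LINE_COMMENT.matcher(code).replaceAll("");
    }

    public static void main(String[] args) {
        String code = "String url = \"http://example.com\"; // home page";
        // The "//" inside the string literal is matched first, so the literal is cut in half.
        System.out.println(stripLineComments(code)); // prints: String url = "http:
    }
}
```

The pattern has no way of knowing that it is inside a string literal. Tracking that kind of context is exactly what the states of a state machine provide.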
- Removing comments from C/C++ code
- Removing comments from Java code
- Program
- References
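The material here draws on the blog post http://www.cnblogs.com/zhanghaiba/p/3569928.html#3853787, which is very well written.
Building on that post, this article makes the following modifications and optimizations:
Every state except NOTE_MULTILINE_STAR has to do some character (or string) handling so that the output stays correct. See the code at the end of the article for details.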
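As a rough illustration of what "handling characters to keep the output correct" means, here is a stripped-down sketch that only knows about `/* */` comments. It is deliberately simpler than the full program: it ignores `//` comments and string and character literals, and it drops line breaks inside comments. The class name `BlockCommentSketch` and the method `strip` are assumptions for this example only.

```java
class BlockCommentSketch {
    enum State { CODE, SLASH, NOTE_MULTILINE, NOTE_MULTILINE_STAR }

    static String strip(String src) {
        StringBuilder out = new StringBuilder();
        State state = State.CODE;
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            switch (state) {
                case CODE:
                    if (c == '/') {
                        state = State.SLASH;      // hold the slash back: it may open a comment
                    } else {
                        out.append(c);
                    }
                    break;
                case SLASH:
                    if (c == '*') {
                        state = State.NOTE_MULTILINE;
                    } else if (c == '/') {
                        out.append('/');          // previous slash was ordinary; this one is now held back
                    } else {
                        out.append('/').append(c); // emit the held-back slash plus the current character
                        state = State.CODE;
                    }
                    break;
                case NOTE_MULTILINE:
                    if (c == '*') {
                        state = State.NOTE_MULTILINE_STAR;
                    }
                    break;
                case NOTE_MULTILINE_STAR:
                    // Nothing is ever written to the output in this state.
                    if (c == '/') {
                        state = State.CODE;
                    } else if (c != '*') {
                        state = State.NOTE_MULTILINE;
                    }
                    break;
            }
        }
        if (state == State.SLASH) {
            out.append('/');                      // input ended right after a held-back slash
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("int a = 1; /* note **/ int b = a / 2;"));
    }
}
```

The SLASH state is the interesting one: the slash is withheld until the next character shows whether it opens a comment, and it has to be written out afterwards if it does not.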
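As you can see, the comment rules in Java are simpler: /** */ is fully covered by the states for /* */, and Java has neither backslash-continued comments nor backslash-continued string literals, so the state machine is simpler still. Interested readers can draw it themselves; I will not include a diagram here. In other words, the program for removing C/C++ comments can also be used to remove Java comments. One edge case to keep in mind: because the C/C++ logic treats a trailing backslash as a line continuation, a Java // comment that ends with a backslash would swallow the following line as well.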
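To make the point about /** */ concrete, the sketch below records which state the machine is in after each character of a short Javadoc-style comment. Only the four states relevant to block comments are modelled, and the names `JavadocTrace`, `next`, and `trace` are assumptions introduced for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class JavadocTrace {
    enum State { CODE, SLASH, NOTE_MULTILINE, NOTE_MULTILINE_STAR }

    // Transition function for the block-comment part of the state machine.
    static State next(State s, char c) {
        switch (s) {
            case CODE:
                return c == '/' ? State.SLASH : State.CODE;
            case SLASH:
                if (c == '*') return State.NOTE_MULTILINE;
                return c == '/' ? State.SLASH : State.CODE;
            case NOTE_MULTILINE:
                return c == '*' ? State.NOTE_MULTILINE_STAR : State.NOTE_MULTILINE;
            default: // NOTE_MULTILINE_STAR
                if (c == '/') return State.CODE;
                return c == '*' ? State.NOTE_MULTILINE_STAR : State.NOTE_MULTILINE;
        }
    }

    // Returns the state reached after each character of src, starting from CODE.
    static List<State> trace(String src) {
        List<State> states = new ArrayList<>();
        State s = State.CODE;
        for (char c : src.toCharArray()) {
            s = next(s, c);
            states.add(s);
        }
        return states;
    }

    public static void main(String[] args) {
        System.out.println(trace("/** a */"));
    }
}
```

For the input `/** a */`, the second `*` simply moves the machine into NOTE_MULTILINE_STAR and the following space moves it back to NOTE_MULTILINE, so no dedicated Javadoc state is needed.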
```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

/**
 * @author xiaoxi666
 * @version 1.0.0 2017.12.01
 */
public class deleteCAndCplusplusAndJavaNote {

    /**
     * States of the comment-removal state machine.
     */
    enum State {
        CODE,                   // ordinary code
        SLASH,                  // a '/' has just been read
        NOTE_MULTILINE,         // inside a multi-line comment
        NOTE_MULTILINE_STAR,    // '*' read inside a multi-line comment
        NOTE_SINGLELINE,        // inside a single-line comment
        BACKSLASH,              // backslash in a single-line comment (line continuation)
        CODE_CHAR,              // inside a character literal
        CHAR_ESCAPE_SEQUENCE,   // escape sequence inside a character literal
        CODE_STRING,            // inside a string literal
        STRING_ESCAPE_SEQUENCE  // escape sequence inside a string literal
    }

    /**
     * Removes the comments from the given code and returns the result as a String.
     *
     * @param strToHandle the code whose comments are to be removed
     * @return the code with comments removed
     */
    public static String delete_C_Cplusplus_Java_Note(String strToHandle) {
        StringBuilder builder = new StringBuilder();
        State state = State.CODE; // initial state

        for (int i = 0; i < strToHandle.length(); ++i) {
            char c = strToHandle.charAt(i);
            switch (state) {
                case CODE:
                    if (c == '/') {
                        // Hold the slash back until we know whether it opens a comment.
                        state = State.SLASH;
                    } else {
                        builder.append(c);
                        if (c == '\'') {
                            state = State.CODE_CHAR;
                        } else if (c == '\"') {
                            state = State.CODE_STRING;
                        }
                    }
                    break;
                case SLASH:
                    if (c == '*') {
                        state = State.NOTE_MULTILINE;
                    } else if (c == '/') {
                        state = State.NOTE_SINGLELINE;
                    } else {
                        // Not a comment: emit the held-back slash and the current character.
                        builder.append('/');
                        builder.append(c);
                        if (c == '\'') {
                            state = State.CODE_CHAR;   // a literal may start right after the slash
                        } else if (c == '\"') {
                            state = State.CODE_STRING;
                        } else {
                            state = State.CODE;
                        }
                    }
                    break;
                case NOTE_MULTILINE:
                    if (c == '*') {
                        state = State.NOTE_MULTILINE_STAR;
                    } else {
                        if (c == '\n') {
                            builder.append("\r\n"); // keep the blank line; this can also be dropped
                        }
                        state = State.NOTE_MULTILINE; // stay in the current state
                    }
                    break;
                case NOTE_MULTILINE_STAR:
                    if (c == '/') {
                        state = State.CODE;
                    } else if (c == '*') {
                        state = State.NOTE_MULTILINE_STAR; // stay in the current state
                    } else {
                        state = State.NOTE_MULTILINE;
                    }
                    break;
                case NOTE_SINGLELINE:
                    if (c == '\\') {
                        state = State.BACKSLASH;
                    } else if (c == '\n') {
                        builder.append("\r\n");
                        state = State.CODE;
                    } else {
                        state = State.NOTE_SINGLELINE; // stay in the current state
                    }
                    break;
                case BACKSLASH:
                    if (c == '\\' || c == '\r') {
                        // Another backslash, or the '\r' of a Windows "\r\n" line ending.
                        state = State.BACKSLASH;
                    } else {
                        if (c == '\n') {
                            builder.append("\r\n"); // keep the blank line; this can also be dropped
                        }
                        // After an escaped line break the comment continues on the next line
                        // and ends at that line's own line break.
                        state = State.NOTE_SINGLELINE;
                    }
                    break;
                case CODE_CHAR:
                    builder.append(c);
                    if (c == '\\') {
                        state = State.CHAR_ESCAPE_SEQUENCE;
                    } else if (c == '\'') {
                        state = State.CODE;
                    } else {
                        state = State.CODE_CHAR; // stay in the current state
                    }
                    break;
                case CHAR_ESCAPE_SEQUENCE:
                    builder.append(c);
                    state = State.CODE_CHAR;
                    break;
                case CODE_STRING:
                    builder.append(c);
                    if (c == '\\') {
                        state = State.STRING_ESCAPE_SEQUENCE;
                    } else if (c == '\"') {
                        state = State.CODE;
                    } else {
                        state = State.CODE_STRING; // stay in the current state
                    }
                    break;
                case STRING_ESCAPE_SEQUENCE:
                    builder.append(c);
                    state = State.CODE_STRING;
                    break;
                default:
                    break;
            }
        }
        if (state == State.SLASH) {
            builder.append('/'); // the input ended right after a held-back slash
        }
        return builder.toString();
    }

    /**
     * Reads the code from the given file and returns it as a String.
     *
     * @param inputFileName the file whose comments are to be removed
     * @return the contents of the file
     * @note the input file is read as UTF-8
     */
    public static String readFile(String inputFileName) {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(inputFileName), StandardCharsets.UTF_8))) {
            String s;
            // Read one line at a time until readLine() returns null (end of file).
            while ((s = reader.readLine()) != null) {
                builder.append(s);
                builder.append("\r\n"); // Windows line ending
            }
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(1);
        }
        return builder.toString();
    }

    /**
     * Saves the comment-free code to the given new file.
     *
     * @param outputFileName name of the file that receives the comment-free code
     * @param strHandled     the code with comments removed
     */
    public static void writeFile(String outputFileName, String strHandled) {
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream(outputFileName), StandardCharsets.UTF_8))) {
            writer.write(strHandled);
            System.out.println("Code without comments has been saved successfully in " + outputFileName);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Reads the input file, removes the comments, and writes the result to a new file.
     *
     * @param args unused
     */
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        // File whose comments are to be removed
        System.out.println("Name of the file to remove comments from:");
        String inputFileName = in.nextLine();
        // File that receives the comment-free code
        System.out.println("Name of the file to save the comment-free code to:");
        String outputFileName = in.nextLine();

        String strToHandle = readFile(inputFileName);
        String strHandled = delete_C_Cplusplus_Java_Note(strToHandle);
        writeFile(outputFileName, strHandled);
    }
}
```
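- How to remove all comments from C/C++ code? A brief look at programming with state machines: http://www.cnblogs.com/zhanghaiba/p/3569928.html#3853787
- Who can write a regular expression that removes comments: http://bbs.csdn.net/topics/380183706
- Removing code comments with regular expressions: http://blog.csdn.net/conquer0715/article/details/14446463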
Source: http://www.cnblogs.com/xiaoxi666/p/7931763.html