这里有新鲜出炉的Java并发编程示例,程序狗速度看过来!
java 是一种可以撰写跨平台应用软件的面向对象的程序设计语言,是由Sun Microsystems公司于1995年5月推出的Java程序设计语言和Java平台(即JavaEE(j2ee), JavaME(j2me), JavaSE(j2se))的总称。
这篇文章主要介绍了java实现求两个字符串最大公共子串的方法,详细的描述了两个字符串的最大公共子串算法的实现,需要的朋友可以参考下
本文实例讲述了java实现求两个字符串最大公共子串的方法。分享给大家供大家参考,具体如下:
最近在项目工作中有一个关于文本对比的需求,经过这段时间的学习,总结了这篇博客内容:求两个字符串的最大公共子串。
算法思想:基于图计算两字符串的公共子串。具体算法思想参照下图:
输入字符串S1:achmacmh 输入字符串S2:macham
具体的实现代码如下所示:
- package cn.lulei.compare;
- import java.util.ArrayList;
- import java.util.Collections;
- import java.util.Comparator;
- import java.util.List;
- public class StringCompare {
- private int a;
- private int b;
- public String getMaxLengthCommonString(String s1, String s2) {
- if (s1 == null || s2 == null) {
- return null;
- }
- a = s1.length(); //s1长度做行
- b = s2.length(); //s2长度做列
- if (a == 0 || b == 0) {
- return "";
- }
- //设置匹配矩阵
- boolean[][] array = new boolean[a][b];
- for (int i = 0; i < a; i++) {
- char c1 = s1.charAt(i);
- for (int j = 0; j < b; j++) {
- char c2 = s2.charAt(j);
- if (c1 == c2) {
- array[i][j] = true;
- } else {
- array[i][j] = false;
- }
- }
- }
- //求所有公因子字符串,保存信息为相对第二个字符串的起始位置和长度
- List < ChildString > childStrings = new ArrayList < ChildString > ();
- for (int i = 0; i < a; i++) {
- getMaxSort(i, 0, array, childStrings);
- }
- for (int i = 1; i < b; i++) {
- getMaxSort(0, i, array, childStrings);
- }
- //排序
- sort(childStrings);
- if (childStrings.size() < 1) {
- return "";
- }
- //返回最大公因子字符串
- int max = childStrings.get(0).maxLength;
- StringBuffer sb = new StringBuffer();
- for (ChildString s: childStrings) {
- if (max != s.maxLength) {
- break;
- }
- sb.append(s2.substring(s.maxStart, s.maxStart + s.maxLength));
- sb.append("\n");
- }
- return sb.toString();
- }
- //排序,倒叙
- private void sort(List < ChildString > list) {
- Collections.sort(list, new Comparator < ChildString > () {
- public int compare(ChildString o1, ChildString o2) {
- return o2.maxLength - o1.maxLength;
- }
- });
- }
- //求一条斜线上的公因子字符串
- private void getMaxSort(int i, int j, boolean[][] array, List < ChildString > sortBean) {
- int length = 0;
- int start = j;
- for (; i < a && j < b; i++, j++) {
- if (array[i][j]) {
- length++;
- } else {
- sortBean.add(new ChildString(length, start));
- length = 0;
- start = j + 1;
- }
- if (i == a - 1 || j == b - 1) {
- sortBean.add(new ChildString(length, start));
- }
- }
- }
- //公因子类
- class ChildString {
- int maxLength;
- int maxStart;
- ChildString(int maxLength, int maxStart) {
- this.maxLength = maxLength;
- this.maxStart = maxStart;
- }
- }
- /**
- * @param args
- */
- public static void main(String[] args) {
- // TODO Auto-generated method stub
- System.out.println(new StringCompare().getMaxLengthCommonString("achmacmh", "macham"));
- }
- }
程序最终执行结果是:
对于两个文件的比对个人认为可以参照这种算法思想(自己现在并为实现),在日后的博客中将会写到。
上述实现过程中,用数组保存了所有的公共子串信息,然后排序取最大的子串,这种做法如果只是求最大子串的话,算法就不是很合理,因此做了如下修改,List只保存当前计算中最大的子串,具体实现如下:
- /**
- *@Description: 字符串比较
- */
- package com.lulei.test;
- import java.util.ArrayList;
- import java.util.List;
- public class StringCompare {
- private int a;
- private int b;
- private int maxLength = -1;
- public String getMaxLengthCommonString(String s1, String s2) {
- if (s1 == null || s2 == null) {
- return null;
- }
- a = s1.length(); //s1长度做行
- b = s2.length(); //s2长度做列
- if (a == 0 || b == 0) {
- return "";
- }
- //设置匹配矩阵
- boolean[][] array = new boolean[a][b];
- for (int i = 0; i < a; i++) {
- char c1 = s1.charAt(i);
- for (int j = 0; j < b; j++) {
- char c2 = s2.charAt(j);
- if (c1 == c2) {
- array[i][j] = true;
- } else {
- array[i][j] = false;
- }
- }
- }
- //求所有公因子字符串,保存信息为相对第二个字符串的起始位置和长度
- List < ChildString > childStrings = new ArrayList < ChildString > ();
- for (int i = 0; i < a; i++) {
- getMaxSort(i, 0, array, childStrings);
- }
- for (int i = 1; i < b; i++) {
- getMaxSort(0, i, array, childStrings);
- }
- StringBuffer sb = new StringBuffer();
- for (ChildString s: childStrings) {
- sb.append(s2.substring(s.maxStart, s.maxStart + s.maxLength));
- sb.append("\n");
- }
- return sb.toString();
- }
- //求一条斜线上的公因子字符串
- private void getMaxSort(int i, int j, boolean[][] array, List < ChildString > sortBean) {
- int length = 0;
- int start = j;
- for (; i < a && j < b; i++, j++) {
- if (array[i][j]) {
- length++;
- } else {
- //直接add,保存所有子串,下面的判断,只保存当前最大的子串
- //sortBean.add(new ChildString(length, start));
- if (length == maxLength) {
- sortBean.add(new ChildString(length, start));
- } else if (length > maxLength) {
- sortBean.clear();
- maxLength = length;
- sortBean.add(new ChildString(length, start));
- }
- length = 0;
- start = j + 1;
- }
- if (i == a - 1 || j == b - 1) {
- //直接add,保存所有子串,下面的判断,只保存当前最大的子串
- //sortBean.add(new ChildString(length, start));
- if (length == maxLength) {
- sortBean.add(new ChildString(length, start));
- } else if (length > maxLength) {
- sortBean.clear();
- maxLength = length;
- sortBean.add(new ChildString(length, start));
- }
- }
- }
- }
- //公因子类
- class ChildString {
- int maxLength;
- int maxStart;
- ChildString(int maxLength, int maxStart) {
- this.maxLength = maxLength;
- this.maxStart = maxStart;
- }
- }
- /**
- * @param args
- */
- public static void main(String[] args) {
- // TODO Auto-generated method stub
- System.out.println(new StringCompare().getMaxLengthCommonString("abcdef", "defabc"));
- }
- }
来源: http://www.phperz.com/article/17/1202/359640.html