Comparing two files with Python's difflib: a worked example (how to find the differences between two files in Python)

25-02-19 15

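This article walks through a worked example of comparing two files with Python's difflib and answers the question of how to find the differences between two files in Python. It also covers comparing the Python files in two directories on Linux, comparing two files on Linux, the difflib module explained with examples, and the difflib module in detail.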

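Contents:

Comparing two files with Python's difflib: a worked example (how to find the differences between two files in Python)
Comparing the Python files in two directories on Linux
Comparing two files on Linux
The Python difflib module explained with examples
The Python difflib module in detail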

Comparing two files with Python's difflib: a worked example (how to find the differences between two files in Python)

# coding=utf8
'''
This module manages files.
The constructor uses the Config class from the configuration-reading module
to obtain the download path and the save path.
The class provides three methods:
clearResultCSV(): delete every CSV file in the download path whose name starts with "result"
moveCSVToSave(): rename result.csv in the download path, then move the renamed file to the save path
getLastReqement(): compare the two newest files in the save path and write the differences to diff.log

'''
import os
# File operations
import shutil
import re
import time
import difflib
# Config comes from the project's own configuration-reading module
from readConfig import Config


class FileManger(object):
    def __init__(self, configObj=None):
        try:
            # Use the Config instance passed in, or create one
            self.config = configObj if configObj is not None else Config()
            # Download path
            self.down = self.config.getDownPath()
            # Save path; create the directory if it does not exist yet
            try:
                self.save = self.config.getSavePath()
                if not os.path.exists(self.save):
                    os.mkdir(self.save)
            except Exception as e:
                print("Save Report Error:", e)
        except Exception as e:
            print(e)

    def clearResultCSV(self):
        try:
            # List every file in the download path
            fileList = os.listdir(self.down)
            for item in fileList:
                # Delete CSV files whose names start with "result"
                if re.match(r"result.*\.csv$", item):
                    os.remove(os.path.join(self.down, item))
        except Exception as e:
            print(e)

    def moveCSVToSave(self):
        try:
            # List every file in the download path
            fileList = os.listdir(self.down)
            # Current time as a string, e.g. 20170306143330
            now = time.strftime("%Y%m%d%H%M%S")
            for item in fileList:
                try:
                    # Look for result.csv in the download path
                    if re.match(r"result\.csv$", item):
                        oldfilename = os.path.join(self.down, item)
                        # The new name looks like 20170306143330.csv
                        newfileName = os.path.join(self.down, now + ".csv")
                        os.rename(oldfilename, newfileName)
                        # Move the renamed file to the save path
                        shutil.move(newfileName, self.save)
                except Exception as e:
                    print(e)
        except Exception as e:
            print(e)

    def getLastReqement(self):
        try:
            # List the files in the save path. The names are timestamps, so
            # sorting puts the newest file last (os.listdir returns names
            # in arbitrary order).
            listfile = sorted(os.listdir(self.save))
            # At least two files are needed for a comparison
            if len(listfile) > 1:
                try:
                    self.diffPath = self.config.getDiffPath()
                    if not os.path.exists(self.diffPath):
                        os.mkdir(self.diffPath)
                    # Newest file, with its path
                    lastfile = os.path.join(self.save, listfile[-1])
                    # Second-newest file
                    twofile = os.path.join(self.save, listfile[-2])
                    with open(lastfile, "r") as f:
                        lastestFile = f.readlines()
                    with open(twofile, "r") as f:
                        secondLastestFile = f.readlines()
                    diffInf = difflib.ndiff(lastestFile, secondLastestFile)
                    with open(os.path.join(self.diffPath, "diff.log"), "w") as diffLog:
                        diffLog.writelines(diffInf)
                except Exception as e:
                    print("Save Diff Error:", e)
        except Exception as e:
            print(e)


def test():
    '''
    Smoke test: checks that the module runs end to end.
    '''
    # Create a FileManger instance
    fm = FileManger()
    #fm.clearResultCSV()

    fm.moveCSVToSave()
    fm.getLastReqement()
    print(os.listdir(fm.save))


if __name__ == "__main__":
    test()
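
The heart of getLastReqement() is reading two files and handing their lines to difflib. A minimal standalone sketch of just that step, assuming UTF-8 text files; the names diff_files and changed_lines are hypothetical helpers, not part of the module above:

```python
import difflib


def diff_files(old_path, new_path, encoding="utf-8"):
    """Return the ndiff delta between two text files as a list of lines."""
    with open(old_path, encoding=encoding) as old_file:
        old_lines = old_file.readlines()
    with open(new_path, encoding=encoding) as new_file:
        new_lines = new_file.readlines()
    return list(difflib.ndiff(old_lines, new_lines))


def changed_lines(delta):
    """Keep only the lines that were removed ('- ') or added ('+ ')."""
    return [line for line in delta if line.startswith(("- ", "+ "))]
```

Passing the older file first makes '- ' mean "removed" and '+ ' mean "added", which is usually the easier direction to read.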

Comparing the Python files in two directories on Linux

1. The code:

#!/bin/bash
##########################################################
# Filename      : pyDiff
# Description   : Show the differences between the Python files in two directories. Usage:
#                 pyDiff dir1 dir2
#                 Reference: https://vi.stackexchange.com/questions/778/how-to-diff-and-merge-two-directories
# #######################################################

#######################################
#
# r means red; black, red, green, yellow, blue and white are supported
# The default is red
#
#######################################

function color {
    case "$2" in 
    k) echo -e "\033[30m${1}\033[0m";;
    r) echo -e "\033[31m${1}\033[0m";; 
    g) echo -e "\033[32m${1}\033[0m";;   
    y) echo -e "\033[33m${1}\033[0m";;   
    b) echo -e "\033[34m${1}\033[0m";;   
    *) echo -e "\033[37m${1}\033[0m"
    esac
}

function colorEcho {
    if [ -z "$2" ]; then
        c='r'
    else
        c=$2
    fi

    color "$1" "$c"
}

function Diff() {
    local dir1
    local dir2
    dir1=$1
    dir2=$2

    if [ -d "$dir1" ] && [ -d "$dir2" ]; then
        for files in $(diff -rq $dir1 $dir2|grep 'differ$'|sed "s/^Files //g;s/ differ$//g;s/ and /:/g"); do 
            if [[ "${files%:*}" == *.py ]] && [[ "${files#*:}" == *.py ]]; then
                echo 'File with diff: ' ${files%:*} ' <---> ' ${files#*:}; 
            fi
        done
        local y
        read -p "Show the diffs with vimdiff, y or n? " -n 1 y
        if [ "$y" != "y" ]; then 
            exit 0
        fi
        for files in $(diff -rq $dir1 $dir2|grep 'differ$'|sed "s/^Files //g;s/ differ$//g;s/ and /:/g"); do 
            if [[ "${files%:*}" == *.py ]] && [[ "${files#*:}" == *.py ]]; then
                vimdiff ${files%:*} ${files#*:}; 
            fi
        done
    elif [ -f "$dir1" ] && [ -f "$dir2" ]; then 
        vimdiff $1 $2 
    else
        echo "$1 $2"
        colorEcho "Error!!! \$1 and \$2 must be of the same type (dir or file)" r
        exit 1
    fi
}

Diff $1 $2

2. Usage:

Run ./pyDiff dir1 dir2 on the command line.
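
For readers who would rather stay in Python, the listing half of the script (finding the .py files that differ between two directory trees) can be approximated with the standard filecmp module. This is a sketch under assumptions, not a port of the script: the function name differing_py_files is hypothetical, and dircmp compares files shallowly by default, so files with the same type, size and modification time are treated as equal.

```python
import filecmp
import os


def differing_py_files(dir1, dir2):
    """Recursively collect (path1, path2) pairs of .py files that differ."""
    pairs = []

    def walk(cmp):
        # diff_files: names present on both sides whose contents differ
        for name in cmp.diff_files:
            if name.endswith(".py"):
                pairs.append((os.path.join(cmp.left, name),
                              os.path.join(cmp.right, name)))
        # subdirs maps each common subdirectory name to its own dircmp
        for sub in cmp.subdirs.values():
            walk(sub)

    walk(filecmp.dircmp(dir1, dir2))
    return sorted(pairs)
```

Each returned pair could then be passed to a diff tool, in the same way the script passes pairs to vimdiff.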

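Comparing two files on Linux

During project maintenance you often need to clean up junk files, for example deleting files that have no record in the database. Shell commands are a good fit for this job. Without further ado, here is the code:

1. First export the data from the database table, by whatever method you like.

2. On the server, list the names of all files in the target directory and write them to a file.

  ls *.* >***.txt

3. Compare the contents of the two files and delete the entries that differ.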

#!/bin/sh
#BEGIN
# Sort each list and drop duplicate lines
sort -u test1.txt > a_u.txt
sort -u test2.txt > b_u.txt
diff a_u.txt  b_u.txt > c.txt
# On each "<" or ">" line of c.txt the second field is a file name
for x in ` awk '{print $2}' c.txt `
do
        rm -rf $x;
done
#echo filename
# END

And that's it!

Note: do not edit the shell script on Windows. Encoding problems may leave garbage characters such as ? at the end of file names.
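
The sort, uniq and diff pipeline above boils down to finding the names that appear in only one of the two lists. A rough Python equivalent, assuming each file holds one name per line; the function name names_only_in_one is hypothetical, and the sketch only reports the names instead of deleting anything:

```python
def names_only_in_one(list_a_path, list_b_path):
    """Return (only_in_a, only_in_b) for two files with one name per line."""

    def read_names(path):
        # A set removes duplicates, like sort | uniq; blank lines are skipped
        with open(path, encoding="utf-8") as handle:
            return {line.strip() for line in handle if line.strip()}

    names_a = read_names(list_a_path)
    names_b = read_names(list_b_path)
    return sorted(names_a - names_b), sorted(names_b - names_a)
```

Reviewing the two returned lists before removing any file is a cheap safeguard, since the shell loop deletes names from both sides of the diff.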

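The Python difflib module explained with examples

The classes and functions in the difflib module compare sequences. The module can compare files and produce the differences either as text or as an HTML page. To compare directories, use the filecmp module instead.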

class difflib.SequenceMatcher

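This class compares pairs of sequences of any type, as long as the sequence elements are hashable. It looks for the longest contiguous matching subsequence that contains no "junk" elements.

In terms of complexity it improves on the original gestalt pattern matching algorithm: the worst case takes quadratic time, and the best case is linear.

It has an automatic junk heuristic: in a sequence of at least 200 items, any item whose duplicates account for more than 1% of the sequence is treated as junk. Pass autojunk=False to turn this off.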
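
A small sketch of SequenceMatcher in use may help here. The helper names similarity and edit_tags are hypothetical; ratio() and get_opcodes() are the real SequenceMatcher methods.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity in [0, 1]: 2 * matched elements / total elements."""
    # autojunk=False switches off the automatic junk heuristic
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def edit_tags(a, b):
    """Names of the operations that turn a into b, in order."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [tag for tag, i1, i2, j1, j2 in matcher.get_opcodes()]


print(similarity("abcd", "bcde"))   # 0.75
print(edit_tags("abcd", "bcde"))    # ['delete', 'equal', 'insert']
```

The two strings share the run "bcd" (3 elements out of 8 in total), which gives 2 * 3 / 8 = 0.75.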

class difflib.Differ

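This class compares sequences of text lines and produces human-readable differences, or deltas. Each line of the result starts with a two-character code:

'- '   line unique to sequence 1
'+ '   line unique to sequence 2
'  '   line common to both sequences
'? '   line not present in either input sequence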
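
As a sketch of how these results can be consumed: every line yielded by Differ.compare() begins with one of the two-character codes '- ', '+ ', '  ' or '? ', so a delta can be summarized by counting them. The function name count_changes is hypothetical.

```python
from difflib import Differ


def count_changes(old_lines, new_lines):
    """Count the lines of a Differ delta by their two-character code."""
    counts = {"- ": 0, "+ ": 0, "  ": 0, "? ": 0}
    for line in Differ().compare(old_lines, new_lines):
        counts[line[:2]] += 1
    return counts


print(count_changes(["a\n", "b\n"], ["a\n", "c\n"]))
```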

class difflib.HtmlDiff

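This class creates an HTML table (or a complete HTML file containing the table) that shows the differences side by side, line by line.

make_file(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])

make_table(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])

make_file returns a complete HTML file containing the comparison table, while make_table returns only the table. In both, the changed parts are highlighted.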
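
A short sketch of the difference between the two methods. The helper names html_fragment and html_page and the descriptions "old" and "new" are illustrative assumptions; context=True with numlines=1 limits the output to changed lines plus one line of context.

```python
import difflib


def html_fragment(old_lines, new_lines):
    """Only the <table> element, for embedding in an existing page."""
    return difflib.HtmlDiff().make_table(
        old_lines, new_lines, fromdesc="old", todesc="new",
        context=True, numlines=1)


def html_page(old_lines, new_lines):
    """A complete HTML document that contains the same kind of table."""
    return difflib.HtmlDiff().make_file(
        old_lines, new_lines, fromdesc="old", todesc="new",
        context=True, numlines=1)
```

make_table is the better fit when the diff goes into a page that already has its own styling; make_file is enough for a quick standalone report.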

difflib.context_diff(a,b[,fromfile][,tofile][,fromfiledate][,tofiledate][,n][,lineterm])

Compares a and b (lists of strings) and returns a generator that yields the lines of the delta in context diff format.
Example:

>>> import sys
>>> from difflib import context_diff
>>> s1 = ['bacon\n','eggs\n','ham\n','guido\n']
>>> s2 = ['python\n','eggy\n','hamster\n','guido\n']
>>> for line in context_diff(s1,s2,fromfile='before.py',tofile='after.py'):
...   sys.stdout.write(line)
*** before.py
--- after.py
***************
*** 1,4 ****
! bacon
! eggs
! ham
  guido
--- 1,4 ----
! python
! eggy
! hamster
  guido

difflib.get_close_matches(word, possibilities[, n][, cutoff])

Returns a list of the best matches for word among possibilities, most similar first.

Example:

>>> get_close_matches('appel',['ape','apple','peach','puppy'])
['apple','ape']
>>> import keyword
>>> get_close_matches('wheel',keyword.kwlist)
['while']
>>> get_close_matches('apple',keyword.kwlist)
[]
>>> get_close_matches('accept',keyword.kwlist)
['except']
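
The optional n and cutoff arguments control how many matches come back and how similar they must be (the defaults are n=3 and cutoff=0.6). A minimal "did you mean" sketch; the function name suggest is hypothetical:

```python
from difflib import get_close_matches


def suggest(word, vocabulary, n=1, cutoff=0.6):
    """Return at most n entries of vocabulary that resemble word."""
    return get_close_matches(word, vocabulary, n=n, cutoff=cutoff)


fruits = ["ape", "apple", "peach", "puppy"]
print(suggest("appel", fruits))              # ['apple']
print(suggest("appel", fruits, cutoff=0.9))  # []
```

Raising cutoff makes the match stricter: "apple" scores 0.8 against "appel", so it passes the default threshold but not 0.9.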

difflib.ndiff(a, b[, linejunk][, charjunk])

Compares a and b (lists of strings) and returns a Differ-style delta.
Example:

>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(keepends=True),
...              'ore\ntree\nemu\n'.splitlines(keepends=True))
>>> print(''.join(diff), end="")
- one
?  ^
+ ore
?  ^
- two
- three
?  -
+ tree
+ emu

difflib.restore(sequence, which)

Returns one of the two sequences that produced a delta: which=1 gives the first sequence, which=2 the second.

Example:

>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(keepends=True),
...              'ore\ntree\nemu\n'.splitlines(keepends=True))
>>> diff = list(diff) # materialize the generated delta into a list
>>> print(''.join(restore(diff, 1)), end="")
one
two
three
>>> print(''.join(restore(diff, 2)), end="")
ore
tree
emu
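
One practical consequence is that a stored ndiff delta is enough to recover both inputs. A minimal round-trip sketch; the function name round_trip is hypothetical:

```python
from difflib import ndiff, restore


def round_trip(a, b):
    """Build an ndiff delta, then recover both inputs from it."""
    # The delta is materialized into a list because it is read twice
    delta = list(ndiff(a, b))
    return list(restore(delta, 1)), list(restore(delta, 2))


first = "one\ntwo\nthree\n".splitlines(keepends=True)
second = "ore\ntree\nemu\n".splitlines(keepends=True)
print(round_trip(first, second) == (first, second))  # True
```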

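difflib.unified_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])

Compares a and b (lists of strings) and returns the delta in unified diff format.

Example:

>>> s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
>>> s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']
>>> for line in unified_diff(s1, s2, fromfile='before.py', tofile='after.py'):
...  sys.stdout.write(line)
--- before.py
+++ after.py
@@ -1,4 +1,4 @@
-bacon
-eggs
-ham
+python
+eggy
+hamster
 guido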
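
When the input lines carry no trailing newline (for example after str.splitlines()), passing lineterm="" keeps the generated header lines consistent with them. A small sketch; the function name unified and the file labels are illustrative, and n sets the number of context lines:

```python
from difflib import unified_diff


def unified(old_lines, new_lines, context=3):
    """Unified diff for lines that carry no trailing newline."""
    # lineterm="" stops difflib from appending "\n" to the header lines,
    # so the whole result can be joined with "\n" in one go
    return list(unified_diff(old_lines, new_lines,
                             fromfile="before.py", tofile="after.py",
                             n=context, lineterm=""))


old = ["bacon", "eggs", "ham", "guido"]
new = ["python", "eggy", "hamster", "guido"]
print("\n".join(unified(old, new)))
```

With context=0 the unchanged line " guido" is left out and only the changed lines remain.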

A practical example

Compare two files and generate an HTML file that shows the differences

#coding:utf-8
'''
file:difflibeg.py
date:2017/9/9 10:33
author:lockey
email:lockey@123.com
desc:difflib module learning and practising
'''
import difflib
hd = difflib.HtmlDiff()
with open('G:/python/note/day09/0907code/hostinfo/cpu.py','r') as load:
    loads = load.readlines()

with open('G:/python/note/day09/0907code/hostinfo/mem.py','r') as mem:
    mems = mem.readlines()

# 'w' overwrites any earlier report; the with block closes the file
with open('htmlout.html','w') as fo:
    fo.write(hd.make_file(loads,mems))

[Figure: output of the run]

[Figure: comparison result in the generated HTML file]

The Python difflib module in detail

This article walks through the Python difflib module with examples.

class difflib.SequenceMatcher

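This class compares pairs of sequences of any type, as long as the sequence elements are hashable. It looks for the longest contiguous matching subsequence that contains no "junk" elements.

In terms of complexity it improves on the original gestalt pattern matching algorithm: the worst case takes quadratic time, and the best case is linear.

It has an automatic junk heuristic: in a sequence of at least 200 items, any item whose duplicates account for more than 1% of the sequence is treated as junk. Pass autojunk=False to turn this off.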
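
To make the idea of "junk" concrete, here is a sketch built on SequenceMatcher.find_longest_match(). The helper name longest_common_run is hypothetical; the isjunk argument is the real first parameter of SequenceMatcher, a function that returns True for elements to ignore.

```python
from difflib import SequenceMatcher


def longest_common_run(a, b, isjunk=None):
    """Longest contiguous block shared by a and b, ignoring junk."""
    matcher = SequenceMatcher(isjunk, a, b)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


print(repr(longest_common_run(" abcd", "abcd abcd")))
# ' abcd'
print(repr(longest_common_run(" abcd", "abcd abcd", lambda ch: ch == " ")))
# 'abcd'
```

Without a junk function the space takes part in the match; once spaces are declared junk, a match cannot start on one, so the matcher settles for the shorter run.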

class difflib.Differ

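This class compares sequences of text lines and produces human-readable differences, or deltas. Each line of the result starts with a two-character code:

'- '   line unique to sequence 1
'+ '   line unique to sequence 2
'  '   line common to both sequences
'? '   line not present in either input sequence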

class difflib.HtmlDiff

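This class creates an HTML table (or a complete HTML file containing the table) that shows the differences side by side, line by line.

make_file(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])

make_table(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])

make_file returns a complete HTML file containing the comparison table, while make_table returns only the table. In both, the changed parts are highlighted.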

difflib.context_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])

Compares a and b (lists of strings) and returns a generator that yields the lines of the delta in context diff format.
Example:

>>> s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
>>> s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']
>>> for line in context_diff(s1, s2, fromfile='before.py', tofile='after.py'):
...   sys.stdout.write(line)
*** before.py
--- after.py
***************
*** 1,4 ****
! bacon
! eggs
! ham
  guido
--- 1,4 ----
! python
! eggy
! hamster
  guido
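
The n argument sets how many unchanged lines surround each change (3 by default). A sketch with n=0, which keeps only the changed lines; the function name changed_only and the file labels are illustrative, and lineterm="" is used because the input lines here have no trailing newline:

```python
from difflib import context_diff


def changed_only(old_lines, new_lines):
    """Context diff with zero lines of context around each change."""
    return list(context_diff(old_lines, new_lines,
                             fromfile="before.py", tofile="after.py",
                             n=0, lineterm=""))


old = ["bacon", "eggs", "ham", "guido"]
new = ["python", "eggy", "hamster", "guido"]
print("\n".join(changed_only(old, new)))
```

Lines marked "! " are the changed ones; the unchanged "guido" line no longer appears in the output.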


difflib.get_close_matches(word, possibilities[, n][, cutoff])

Returns a list of the best matches for word among possibilities, most similar first.

Example:

>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
['apple', 'ape']
>>> import keyword
>>> get_close_matches('wheel', keyword.kwlist)
['while']
>>> get_close_matches('apple', keyword.kwlist)
[]
>>> get_close_matches('accept', keyword.kwlist)
['except']

difflib.ndiff(a, b[, linejunk][, charjunk])

Compares a and b (lists of strings) and returns a Differ-style delta.
Example:

>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(keepends=True),
...              'ore\ntree\nemu\n'.splitlines(keepends=True))
>>> print(''.join(diff), end="")
- one
?  ^
+ ore
?  ^
- two
- three
?  -
+ tree
+ emu

difflib.restore(sequence, which)

Returns one of the two sequences that produced a delta: which=1 gives the first sequence, which=2 the second.

Example:

>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(keepends=True),
...              'ore\ntree\nemu\n'.splitlines(keepends=True))
>>> diff = list(diff) # materialize the generated delta into a list
>>> print(''.join(restore(diff, 1)), end="")
one
two
three
>>> print(''.join(restore(diff, 2)), end="")
ore
tree
emu

difflib.unified_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])

Compares a and b (lists of strings) and returns the delta in unified diff format.

Example:

>>> s1 = ['bacon\n', 'eggs\n', 'ham\n', 'guido\n']
>>> s2 = ['python\n', 'eggy\n', 'hamster\n', 'guido\n']
>>> for line in unified_diff(s1, s2, fromfile='before.py', tofile='after.py'):
...  sys.stdout.write(line)
--- before.py
+++ after.py
@@ -1,4 +1,4 @@
-bacon
-eggs
-ham
+python
+eggy
+hamster
 guido

A practical example

Compare two files and generate an HTML file that shows the differences

#coding:utf-8
'''
file:difflibeg.py
date:2017/9/9 10:33
author:lockey
email:lockey@123.com
desc:difflib module learning and practising
'''
import difflib
hd = difflib.HtmlDiff()
with open('G:/python/note/day09/0907code/hostinfo/cpu.py','r') as load:
    loads = load.readlines()

with open('G:/python/note/day09/0907code/hostinfo/mem.py', 'r') as mem:
    mems = mem.readlines()

# 'w' overwrites any earlier report; the with block closes the file
with open('htmlout.html','w') as fo:
    fo.write(hd.make_file(loads,mems))

[Figure: output of the run]

[Figure: comparison result in the generated HTML file]
