[Swift] LeetCode 609. Find Duplicate File in System (finding duplicate files with Total Commander)

25-04-17 5

This article covers [Swift] LeetCode 609. Find Duplicate File in System, along with finding duplicate files with Total Commander. It also collects practical notes on three related topics: what the code $file=isset($file) && $file?$file:'index' means, how to fix 'Could not find first log file name in binary log index file', and the adb file transfer permission error "remote couldn't create file: Read-only file system".

Contents:

[Swift] LeetCode 609. Find Duplicate File in System (finding duplicate files with Total Commander)
What does the code $file=isset($file) && $file?$file:'index' mean?
Fixing 'Could not find first log file name in binary log index file'
adb file transfer permission problem: remote couldn't create file: Read-only file system

[Swift] LeetCode 609. Find Duplicate File in System (finding duplicate files with Total Commander)

Given a list of directory info including the directory path and all the files with contents in this directory, you need to find all the groups of duplicate files in the file system in terms of their paths.

A group of duplicate files consists of at least two files that have exactly the same content.

A single directory info string in the input list has the following format:

"root/d1/d2/.../dm f1.txt(f1_content) f2.txt(f2_content) ... fn.txt(fn_content)"

It means there are n files (f1.txt, f2.txt ... fn.txt with content f1_content, f2_content ... fn_content, respectively) in directory root/d1/d2/.../dm. Note that n >= 1 and m >= 0. If m = 0, it means the directory is just the root directory.

The output is a list of groups of duplicate file paths. Each group contains all the file paths of the files that have the same content. A file path is a string that has the following format:

"directory_path/file_name.txt"
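As a rough illustration of the input and output formats described above, the following sketch splits one directory info string into (path, content) pairs. The helper name parseDirectoryInfo is hypothetical and is not taken from any of the submitted solutions; it assumes tokens are separated by single spaces and that each file token ends with ")".

```swift
// Hypothetical helper (not from the original solutions).
// Parses one directory info string into (path, content) pairs.
func parseDirectoryInfo(_ info: String) -> [(path: String, content: String)] {
    let tokens = info.split(separator: " ")
    guard let directory = tokens.first else { return [] }
    var files: [(path: String, content: String)] = []
    for token in tokens.dropFirst() {
        // Each token is expected to look like "name.txt(content)".
        guard let open = token.firstIndex(of: "("), token.last == ")" else { continue }
        let name = token[..<open]
        let content = token[token.index(after: open)..<token.index(before: token.endIndex)]
        files.append((path: "\(directory)/\(name)", content: String(content)))
    }
    return files
}

let parsed = parseDirectoryInfo("root/a 1.txt(abcd) 2.txt(efgh)")
for file in parsed {
    print(file.path, file.content)
}
```

Tokens that do not match the expected shape are skipped rather than force-unwrapped, which is a design choice of this sketch and not something the problem requires.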

Example 1:

Input:
["root/a 1.txt(abcd) 2.txt(efgh)","root/c 3.txt(abcd)","root/c/d 4.txt(efgh)","root 4.txt(efgh)"]
Output:
[["root/a/2.txt","root/c/d/4.txt","root/4.txt"],["root/a/1.txt","root/c/3.txt"]]

Note:

No order is required for the final output.
You may assume the directory name, file name and file content only have letters and digits, and the length of file content is in the range of [1,50].
The number of files given is in the range of [1,20000].
You may assume no files or directories share the same name in the same directory.
You may assume each given directory info represents a unique directory. Directory path and file info are separated by a single blank space.

Follow-up beyond contest:

Imagine you are given a real file system, how will you search files? DFS or BFS?
If the file content is very large (GB level), how will you modify your solution?
If you can only read the file by 1kb each time, how will you modify your solution?
What is the time complexity of your modified solution? What is the most time-consuming part and memory consuming part of it? How to optimize?
How to make sure the duplicated files you find are not false positives?
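One possible way to approach the follow-up questions is sketched below; it is not an official answer. It groups files by size first, hashes only the files that share a size while feeding the hash one chunk at a time, and finally compares bytes directly so that a hash collision cannot produce a false positive. Everything here is an assumption made for illustration: the in-memory FileTable stands in for a real file system, the names chunkedHash and findDuplicateFiles are hypothetical, and FNV-1a is used only because it is short to write (a cryptographic hash would be a more typical choice).

```swift
// Hypothetical in-memory stand-in for files on disk: path -> bytes.
typealias FileTable = [String: [UInt8]]

// FNV-1a 64-bit hash, consumed one chunk at a time to mimic
// reading a file in fixed-size pieces (1 KB by default).
func chunkedHash(_ bytes: [UInt8], chunkSize: Int = 1024) -> UInt64 {
    var hash: UInt64 = 0xcbf29ce484222325
    var offset = 0
    while offset < bytes.count {
        let end = min(offset + chunkSize, bytes.count)
        for byte in bytes[offset..<end] {
            hash ^= UInt64(byte)
            hash = hash &* 0x100000001b3
        }
        offset = end
    }
    return hash
}

func findDuplicateFiles(in files: FileTable) -> [[String]] {
    // Step 1: files of different sizes cannot be duplicates.
    let bySize = Dictionary(grouping: files.keys, by: { files[$0]!.count })
    var groups: [[String]] = []
    for candidates in bySize.values where candidates.count > 1 {
        // Step 2: hash only the files that share a size.
        let byHash = Dictionary(grouping: candidates, by: { chunkedHash(files[$0]!) })
        for sameHash in byHash.values where sameHash.count > 1 {
            // Step 3: confirm byte for byte to rule out hash collisions.
            var verified: [[String]] = []
            for path in sameHash.sorted() {
                if let i = verified.firstIndex(where: { files[$0[0]]! == files[path]! }) {
                    verified[i].append(path)
                } else {
                    verified.append([path])
                }
            }
            groups += verified.filter { $0.count > 1 }
        }
    }
    return groups
}

let files: FileTable = [
    "root/a/1.txt": Array("abcd".utf8),
    "root/a/2.txt": Array("efgh".utf8),
    "root/c/3.txt": Array("abcd".utf8),
    "root/c/d/4.txt": Array("efgh".utf8),
    "root/4.txt": Array("efgh".utf8)
]
let duplicates = findDuplicateFiles(in: files)
print(duplicates)
```

In this sketch the size check is the cheapest filter, hashing is where most of the reading would happen on a real disk, and the final comparison is only run on files whose hashes already match.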

Runtime: 584 ms

class Solution {
    func findDuplicate(_ paths: [String]) -> [[String]] {
        // Maps file content to every full path that holds that content.
        var ctp = [String: [String]]()
        for path in paths {
            let comps = path.split(separator: " ")
            let dir = comps[0]
            for file in comps[1...] {
                let (fullPath, contents) = analyzefile(file, dir)
                ctp[contents, default: []].append(fullPath)
            }
        }
        return Array(ctp.values).filter { $0.count > 1 }
    }

    // Splits "name.txt(content)" into ("dir/name.txt", "content").
    private func analyzefile(_ file: String.SubSequence, _ dir: String.SubSequence) -> (String, String) {
        let startIndex = file.firstIndex(of: "(")!
        // Index of the trailing ")".
        let endIndex = file.index(before: file.endIndex)
        let contents = String(file[file.index(after: startIndex)..<endIndex])
        let path = String(file[file.startIndex..<startIndex])
        return ("\(dir)/\(path)", contents)
    }
}

Runtime: 700 ms

class Solution {
    func findDuplicate(_ paths: [String]) -> [[String]] {
        // Maps file content to the paths of all files with that content.
        var map = [String: [String]]()
        for path in paths {
            for content in extractContent(path) {
                map[content[1], default: []].append(content[0])
            }
        }
        // Keep only contents shared by at least two files.
        return map.values.filter { $0.count > 1 }
    }

    // Returns [fullPath, content] pairs for one directory info string.
    private func extractContent(_ str: String) -> [[String]] {
        let arr = str.split(separator: " ")
        let root = arr[0]
        var result = [[String]]()
        for i in 1 ..< arr.count {
            let entry = arr[i]
            let left = entry.firstIndex(of: "(")!
            let right = entry.lastIndex(of: ")")!
            let filename = String(entry[..<left])
            // Take the text strictly between the parentheses.
            let content = String(entry[entry.index(after: left) ..< right])
            result.append(["\(root)/\(filename)", content])
        }
        return result
    }
}

Runtime: 908 ms

import Foundation

class Solution {

    typealias Content = String
    typealias FilePath = String
    typealias ExtractedContentPath = (content: Content, filepath: FilePath)

    func findDuplicate(_ paths: [String]) -> [[String]] {
        // Group every file path under its content.
        let contentFileTable = paths.lazy
            .flatMap { self.extractedContentPaths($0) }
            .reduce(into: [Content: [FilePath]]()) { dict, extracted in
                dict[extracted.content, default: []].append(extracted.filepath)
            }

        return contentFileTable.values.filter { $0.count > 1 }
    }

    private func extractedContentPaths(_ input: String) -> [ExtractedContentPath] {
        let tokens = input.components(separatedBy: .whitespaces)
        let directory = tokens.first!
        // Skip the directory token; every remaining token is "name(content)".
        return tokens.dropFirst()
            .map { extractedContentPath(from: $0, directory: directory) }
    }

    private func extractedContentPath(from fileAndContent: String, directory: String) -> ExtractedContentPath {
        // Drop the trailing ")" and split at "(".
        let tokens = fileAndContent.dropLast().components(separatedBy: "(")
        return (content: tokens.last!, filepath: "\(directory)/\(tokens.first!)")
    }
}

Runtime: 992 ms

import Foundation

class Solution {
    func findDuplicate(_ paths: [String]) -> [[String]] {
        // Maps file content to the paths of all files with that content.
        var groups = [String: Array<String>]()

        for info in paths {
            let list = parse(info: info)

            for file in list {
                groups[file.1, default: Array<String>()].append(file.0)
            }
        }

        var result = [[String]]()

        for group in Array(groups.values) {
            if group.count > 1 {
                result.append(group)
            }
        }

        return result
    }

    // Returns (fullPath, contents) pairs for one directory info string.
    func parse(info: String) -> [(String, String)] {
        var components = info.components(separatedBy: " ")
        let path = components.removeFirst()

        var result = [(String, String)]()

        // Splitting "name(content)" on "(" and ")" yields [name, content, ""].
        let splitCharSet = CharacterSet(charactersIn: "()")
        while !components.isEmpty {

            let entry = components.removeFirst()
            let parts = entry.components(separatedBy: splitCharSet)

            let file = "\(path)/\(parts[0])"
            let contents = parts[1]

            result.append((file, contents))
        }

        return result
    }
}

Runtime: 1232 ms

class Solution {
    func findDuplicate(_ paths: [String]) -> [[String]] {
        var fileMapPaths = [String: [String]]()
        paths.forEach { str in
            let arrStrs = str.split(separator: " ")
            let dir = String(arrStrs[0])
            for i in 1..<arrStrs.count {
                let fileInfo = arrStrs[i]
                let subArrStr = fileInfo.split(separator: "(")
                // The key is the raw file content (not an actual MD5 digest),
                // with the trailing ")" removed.
                let md5 = String(subArrStr[1].dropLast())
                let file = String(subArrStr[0])
                let filePath = dir + "/" + file
                fileMapPaths[md5, default: []].append(filePath)
            }
        }
        return fileMapPaths.values.filter { $0.count > 1 }
    }
}

Runtime: 1264 ms

class Solution {
    func findDuplicate(_ paths: [String]) -> [[String]] {
        // Build a dictionary of [content: [filePath]], then return the values
        // that hold two or more paths.
        var contentToFiles = [String: [String]]()
        for path in paths {
            let params = path.split(separator: " ")
            guard let dir = params.first else {
                continue
            }
            for i in 1 ..< params.count {
                let fileParams = params[i].split(separator: "(")
                guard let fileName = fileParams.first, let fileContentWithExtraInfo = fileParams.last else {
                    continue
                }
                // Drop the trailing ")" from the content.
                let fileContent = String(fileContentWithExtraInfo.dropLast())
                let filePath = String(dir) + "/" + String(fileName)
                contentToFiles[fileContent, default: []].append(filePath)
            }
        }
        return contentToFiles.values.filter { $0.count >= 2 }
    }
}

Runtime: 1324 ms

Memory Usage: 26.4 MB

import Foundation

class Solution {
    func findDuplicate(_ paths: [String]) -> [[String]] {
        var ans = [[String]]()
        var map = [String: [String]]()
        for path in paths {
            let temp = path.components(separatedBy: " ")
            let root = temp[0]
            for str in temp {
                let begin = str.findLast("(")
                let end = str.findFirst(")")
                // The directory token has no parentheses (both are -1), so it is skipped.
                if begin + 1 < end {
                    // File content: the characters strictly between "(" and ")".
                    let name = str.subString(begin + 1, end - begin - 1)
                    let s = root + "/" + str.subStringTo(begin - 1)
                    map[name, default: []].append(s)
                }
            }
        }
        for val in map.values where val.count > 1 {
            ans.append(val)
        }
        return ans
    }
}

// String extension
extension String {
    // Returns the substring from the start through index (inclusive).
    // - Parameter index: end index
    // - Returns: the substring
    func subStringTo(_ index: Int) -> String {
        guard !isEmpty, index >= 0 else { return "" }
        let theIndex = self.index(startIndex, offsetBy: min(count - 1, index))
        return String(self[startIndex...theIndex])
    }

    // Returns a substring given a start index and a character count.
    // - begin: index at which the substring starts
    // - count: number of characters to take
    func subString(_ begin: Int, _ count: Int) -> String {
        let start = self.index(self.startIndex, offsetBy: max(0, begin))
        let end = self.index(self.startIndex, offsetBy: min(self.count, begin + count))
        return String(self[start..<end])
    }

    // Searches from index 0 and returns the Int index of the first
    // occurrence of sub in this string, or -1 if it does not occur.
    func findFirst(_ sub: String) -> Int {
        var pos = -1
        if let range = range(of: sub, options: .literal) {
            if !range.isEmpty {
                pos = self.distance(from: startIndex, to: range.lowerBound)
            }
        }
        return pos
    }

    // Returns the Int index of the last occurrence of sub in this string,
    // or -1 if it does not occur.
    func findLast(_ sub: String) -> Int {
        var pos = -1
        if let range = range(of: sub, options: .backwards) {
            if !range.isEmpty {
                pos = self.distance(from: startIndex, to: range.lowerBound)
            }
        }
        return pos
    }
}

Runtime: 1344 ms

import Foundation

class Solution {
    func findDuplicate(_ paths: [String]) -> [[String]] {
        var mapping = [String: [String]]()
        for i in 0..<paths.count {
            let arr = paths[i].components(separatedBy: " ")
            let base = arr[0] + "/"
            for j in 1..<arr.count {
                // arrSplit[0] is the file name; arrSplit[1] is "content)".
                // The trailing ")" stays in the key, which is harmless
                // because every key carries it.
                let arrSplit = arr[j].components(separatedBy: "(")
                mapping[arrSplit[1], default: [String]()].append(base + arrSplit[0])
            }
        }
        return Array(mapping.values).filter { $0.count > 1 }
    }
}

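What does the code $file=isset($file) && $file?$file:'index' mean?

$file=isset($file) && $file?$file:'index';
What does the code above mean? An example would help. What does it actually execute?

Replies (solutions)

It checks whether the variable $file is set and has a truthy value. If so, $file keeps its value; otherwise it becomes 'index'.

? : is the ternary operator.
The line is equivalent to:
if(isset($file) && $file){
    $file=$file;
}else{
    $file='index';
}

? : is the ternary operator. The condition comes before the question mark (?). If the condition is true, the expression takes the value before the colon (:); if the condition is false, it takes the value after the colon.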

Fixing 'Could not find first log file name in binary log index file'

A master-slave replication error in the database:

Slave_IO_Running: No can be caused by a network communication problem, or by an error reading the binary log. The following resolves the log-reading error:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

Solution: stop the slave on the slave machine:

mysql> stop slave;

Log in to mysql on the master machine and record the master's current binary log position, for example:

mysql> show master status;
+-------------------+----------+--------------+--------------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB         |
+-------------------+----------+--------------+--------------------------+
| mysqld-bin.000010 |      106 |              | information_schema,mysql |
+-------------------+----------+--------------+--------------------------+

The current log file is mysqld-bin.000010.

Flush the logs:

mysql> flush logs;

Flushing the logs advances the file number by 1, so File becomes mysqld-bin.000011.

Then immediately run the following on the slave:

mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysqld-bin.000011',MASTER_LOG_POS=106;

mysql> start slave;

mysql> show slave status\G

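adb file transfer permission problem: remote couldn't create file: Read-only file system

adb root

adb remount

These commands reported an error, so the following steps were taken based on search results:

Run adb disable-verity

adb reboot

Run adb root again

adb remount

After that, adb push works.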
