How can I get diff to show only added and deleted lines? If diff can't do it, what tool can?
11 Answers
Try comm
Another way to look at it:
Show lines that only exist in file a: (i.e. what was deleted from a)
comm -23 a b
Show lines that only exist in file b: (i.e. what was added to b)
comm -13 a b
Show lines that only exist in one file or the other: (but not both)
comm -3 a b | sed 's/^\t//'
(Warning: If file a has lines that start with TAB, it (the first TAB) will be removed from the output.)
Sorted files only
NOTE: Both files need to be sorted for comm to work properly. If they aren't already sorted, you should sort them:
sort <a >a.sorted
sort <b >b.sorted
comm -12 a.sorted b.sorted
If the files are extremely long, this may be quite a burden as it requires an extra copy and therefore twice as much disk space.
Or, if your shell supports process substitution (bash, zsh, ksh):
comm -12 <(sort a) <(sort b)
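As a self-contained illustration, here is a minimal sketch that builds two small sample files (the names and contents are made up for this example), sorts them into temporary copies, and captures the deleted and added lines. It sets LC_ALL=C so that sort and comm agree on ordering, and it sticks to temporary files so it should also run in shells without process substitution.

```shell
#!/bin/sh
# Sketch only: sample data and variable names are illustrative assumptions.
LC_ALL=C; export LC_ALL          # same collation for sort and comm
tmp=$(mktemp -d)
printf 'unique\ncommon\nshared\n'     > "$tmp/a"
printf 'shared\nindividual\ncommon\n' > "$tmp/b"

# comm needs sorted input, so sort each file into a temporary copy.
sort "$tmp/a" > "$tmp/a.sorted"
sort "$tmp/b" > "$tmp/b.sorted"

# -23 keeps column 1 (only in a); -13 keeps column 2 (only in b).
deleted=$(comm -23 "$tmp/a.sorted" "$tmp/b.sorted")
added=$(comm -13 "$tmp/a.sorted" "$tmp/b.sorted")

echo "deleted: $deleted"
echo "added:   $added"
rm -rf "$tmp"
```

With this sample data, deleted holds the line unique and added holds the line individual.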
To show additions and deletions without context, line numbers, or markers such as +, -, <, > and !, you can use diff like this:
diff --changed-group-format='%<%>' --unchanged-group-format='' a.txt b.txt
For example, given two files:
a.txt
Common
Common
A-ONLY
Common
b.txt
Common
B-ONLY
Common
Common
The following command will show lines either removed from a or added to b:
diff --changed-group-format='%<%>' --unchanged-group-format='' a.txt b.txt
output:
B-ONLY
A-ONLY
This slightly different command will show lines removed from a.txt:
diff --changed-group-format='%<' --unchanged-group-format='' a.txt b.txt
output:
A-ONLY
Finally, this command will show lines added in b.txt (present in b.txt but not in a.txt):
diff --changed-group-format='%>' --unchanged-group-format='' a.txt b.txt
output:
B-ONLY
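If you want to use these commands in a script, keep in mind that diff exits with status 1 when the files differ, so a plain "did it fail?" check will misfire. The following is a rough sketch, assuming GNU diff; the helper name only_in and the temporary paths are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Sample files from the example above, written to a temporary directory.
tmp=$(mktemp -d)
printf 'Common\nCommon\nA-ONLY\nCommon\n' > "$tmp/a.txt"
printf 'Common\nB-ONLY\nCommon\nCommon\n' > "$tmp/b.txt"

# only_in FORMAT: run GNU diff with the given changed-group format.
# diff exits 0 (same) or 1 (different) on success; 2 means trouble.
only_in() {
    diff --changed-group-format="$1" --unchanged-group-format='' \
        "$tmp/a.txt" "$tmp/b.txt"
    [ "$?" -le 1 ]
}

removed=$(only_in '%<')   # lines only in a.txt
added=$(only_in '%>')     # lines only in b.txt

echo "removed: $removed"
echo "added:   $added"
rm -rf "$tmp"
```

Here removed ends up as A-ONLY and added as B-ONLY, matching the outputs shown above.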
comm might do what you want. From its man page:
DESCRIPTION
Compare sorted files FILE1 and FILE2 line by line.
With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.
These columns are suppressable with -1, -2 and -3 respectively.
Example:
[root@dev ~]# cat a
common
shared
unique
[root@dev ~]# cat b
common
individual
shared
[root@dev ~]# comm -3 a b
	individual
unique
And if you just want the unique lines and don't care which file they're in:
[root@dev ~]# comm -3 a b | sed 's/^\t//'
individual
unique
As the man page says, the files must be sorted beforehand.
Visual comparison tools fit two files together so that a segment with the same number of lines but differing content will be considered a changed segment. Completely new lines between matching segments are considered added segments.
This is also how the sdiff command-line tool works: it shows a side-by-side comparison of two files in a terminal. Changed lines are separated by a | character. If a line exists only in file A, < is used as the separator; if a line exists only in file B, > is used. If your files don't contain < or > characters, you can use this to show only the added and deleted lines:
sdiff A B | grep '[<>]'
No, diff doesn't actually show the differences between two files in the way one might think. It produces a sequence of editing commands for a tool like patch to use to change one file into another.
The difficulty for any attempt at doing what you're looking for is how to define what constitutes a line that has changed versus a deleted one followed by an added one. Also what to do when lines are added, deleted and changed adjacent to each other.
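To make the ambiguity concrete, here is a small sketch (the file names x, y and z are hypothetical) showing how diff's default output labels two similar edits. Replacing a line in place is reported as a change command (c), while dropping a line is reported as a delete command (d). Whether a replacement counts as one change or as a deletion plus an addition is a matter of interpretation.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/x"
printf 'a\nB\nc\n' > "$tmp/y"   # line 2 replaced
printf 'a\nc\n'    > "$tmp/z"   # line 2 removed

# The first line of a normal-format diff is the edit command.
replace_cmd=$(diff "$tmp/x" "$tmp/y" | head -n 1)
delete_cmd=$(diff "$tmp/x" "$tmp/z" | head -n 1)

echo "replacement reported as: $replace_cmd"
echo "deletion reported as:    $delete_cmd"
rm -rf "$tmp"
```

The replacement is reported as 2c2 and the deletion as 2d1.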
Thanks senarvi, your solution (not voted for) actually gave me EXACTLY what I wanted after looking for ages on a ton of pages.
Using your answer, here is what I came up with to get the list of things changed/added/deleted. The example uses 2 versions of the /etc/passwd file and prints out the username for the relevant records.
#!/bin/bash
sdiff passwd1 passwd2 | grep '[|]' | awk -F: '{print "changed: " $1}'
sdiff passwd1 passwd2 | grep '[<]' | awk -F: '{print "deleted: " $1}'
sdiff passwd1 passwd2 | grep '[>]' | awk -F\> '{print $2}' | awk -F: '{print "added: " $1}'
That's what diff does by default... Maybe you need to add some flags to ignore whitespace?
diff -b -B
should ignore blank lines and different numbers of spaces.
I find it simplest to use grep:
Added lines:
grep -xvFf filea.txt fileb.txt
Removed lines:
grep -xvFf fileb.txt filea.txt
-x: match whole lines
-v: lines NOT matching the pattern(s)
-F: treat patterns as fixed strings, not regular expressions
-f <otherfile>: read the list of patterns from a file
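A short sketch of this approach on throwaway sample files (the contents are made up for the illustration). It also shows one caveat worth knowing: grep treats the other file as a set of patterns, so line order and repeated lines are ignored, and a line that merely occurs more often in one file is not reported.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'common\nshared\nunique\n'     > "$tmp/filea.txt"
printf 'common\nindividual\nshared\n' > "$tmp/fileb.txt"

# grep exits 1 when it selects no lines; "|| true" keeps that
# from being treated as a failure.
added=$(grep -xvFf "$tmp/filea.txt" "$tmp/fileb.txt" || true)
removed=$(grep -xvFf "$tmp/fileb.txt" "$tmp/filea.txt" || true)
echo "added:   $added"
echo "removed: $removed"

# Caveat: the second copy of x in "twice" is not reported as added.
printf 'x\n'    > "$tmp/once"
printf 'x\nx\n' > "$tmp/twice"
dup_added=$(grep -xvFf "$tmp/once" "$tmp/twice" || true)
echo "duplicate reported: '${dup_added}'"
rm -rf "$tmp"
```

With this data, added is individual, removed is unique, and dup_added is empty.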
I find this particular form often useful:
diff --changed-group-format='-%<+%>' --unchanged-group-format='' f g
Example:
printf 'a\nb\nc\nd\ne\nf\ng\n' > f
printf 'a\nB\nC\nd\nE\nF\ng\n' > g
diff --old-line-format=$'-%l\n' \
--new-line-format=$'+%l\n' \
--unchanged-line-format='' \
f g
Output:
-b
-c
+B
+C
-e
-f
+E
+F
So it shows old lines with - followed immediately by the corresponding new line with +.
If line c were deleted from f:
printf 'a\nb\nd\ne\nf\ng\n' > f
printf 'a\nB\nC\nd\nE\nF\ng\n' > g
diff --old-line-format=$'-%l\n' \
--new-line-format=$'+%l\n' \
--unchanged-line-format='' \
f g
it looks like this:
-b
+B
+C
-e
-f
+E
+F
The format is documented at man diff:
--line-format=LFMT
format all input lines with LFMT
and:
LTYPE is 'old', 'new', or 'unchanged'.
GTYPE is LTYPE or 'changed'.
and:
LFMT (only) may contain:
%L contents of line
%l contents of line, excluding any trailing newline
[...]
Related question: https://stackoverflow.com/questions/15384818/how-to-get-the-difference-only-additions-between-two-files-in-linux
Tested in Ubuntu 18.04.
We can combine diff and sed to achieve what you want. Let's take the same example from https://serverfault.com/a/68717/947477
[root@dev ~]# cat file1
common
shared
unique
[root@dev ~]# cat file2
common
individual
shared
To show added lines with + and deleted lines with - we can use
[root@dev ~]# diff -u file1 file2 | sed -n '/^\(+\|-\)/p'
--- a 2022-03-25 18:30:57.507551352 +0530
+++ b 2022-03-25 18:31:15.087860053 +0530
-shared
-unique
+individual
Here, -u selects the unified output format, and sed prints only the lines that begin with - or +. Note that the two file header lines (--- and +++) match as well.
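If the header lines are unwanted, one possible variation is to skip the first two lines of the unified diff before filtering. This is only a sketch with made-up sample files (old and new are hypothetical names); it relies on the unified format always starting with exactly two header lines.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/old"
printf 'a\nB\nc\n' > "$tmp/new"

# tail -n +3 skips the "---" and "+++" header lines;
# grep then keeps only the removed (-) and added (+) lines.
changes=$(diff -u "$tmp/old" "$tmp/new" | tail -n +3 | grep '^[+-]')

echo "$changes"
rm -rf "$tmp"
```

For this input the result is the two lines -b and +B, without the file headers.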
A more straightforward answer is
diff file1 file2
< shared
< unique
---
> individual
File1:
text670_1
text067_1
text067_2
File2:
text04_1
text04_2
text05_1
text05_2
text067_1
text067_2
text1000_1
Use:
diff -y file1 file2
This shows the two files side by side, one column per file.
Output:
text670_1
> text04_1
> text04_2
> text05_1
> text05_2
text067_1 text67_1
text067_2 text67_2
> text1000_1
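To hide the lines that are the same on both sides, GNU diff offers --suppress-common-lines together with -y. The following is a minimal sketch on made-up sample files (left and right are hypothetical names); the exact column spacing depends on the output width, so it is not reproduced here.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/left"
printf 'a\nB\nc\n' > "$tmp/right"

# Side-by-side output, keeping only the rows that differ.
side=$(diff -y --suppress-common-lines "$tmp/left" "$tmp/right")

echo "$side"
rm -rf "$tmp"
```

Only the differing row remains: b on the left, B on the right, separated by a | marker.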