.TH JOIN 1
.CT 1 files
.SH NAME
join \- relational database operator
.SH SYNOPSIS
.B join
[
.I options
]
.I file1 file2
.SH DESCRIPTION
.I Join
forms, on the standard output,
a join
of the two relations specified by the lines of
.I file1
and
.IR file2 .
If one of the file names is
.LR - ,
the standard input is used.
.PP
.I File1
and
.I file2
must be sorted in increasing
.SM ASCII
collating
sequence on the fields
on which they are to be joined,
normally the first in each line.
.PP
There is one line in the output
for each pair of lines in
.I file1
and
.I file2
that have identical join fields.
The output line normally consists of the common field,
then the rest of the line from
.IR file1 ,
then the rest of the line from
.IR file2 .
.PP
Input fields are normally separated by spaces or tabs;
output fields by a space.
In this case, multiple separators count as one, and
leading separators are discarded.
.PP
The following options are recognized, with POSIX syntax.
.TP
.BI -a " n
In addition to the normal output,
produce a line for each unpairable line in file
.IR n ,
where
.I n
is 1 or 2.
.TP
.BI -v " n
Like
.BR -a ,
omitting output for paired lines.
.TP
.BI -e " s
Replace empty output fields by string
.IR s .
.TP
.BI -1 " m
.br
.ns
.TP
.BI -2 " m
Join on the
.IR m th
field of
.I file1
or
.IR file2 .
.TP
.BI -j "n m"
Archaic equivalent for
.BI - n " m"\f1.
.TP
.BI -o fields
Each output line comprises the designated fields.
The comma-separated field designators are either
.BR 0 ,
meaning the join field, or have the form
.IR n . m ,
where
.I n
is a file number and
.I m
is a field number.
Archaic usage allows separate arguments for field designators.
.TP
.BI -t c
Use character
.I c
as the only field separator on input and output.
Every appearance of
.I c
in a line is significant.
.SH EXAMPLES
.TP
.L
sort -t: -k 2,2 /adm/users | join -t: -1 2 -a 1 -e "" - bdays
Add birthdays to the
.B /adm/users
file, leaving unknown
birthdays empty.
The layout of
.B /adm/users
is given in
.IR users (6);
.B bdays
contains sorted lines like
.LR "ken:Feb\ 4,\ 1953" .
.TP
.L
tr : ' ' </adm/users | sort -k 3,3 >temp
.br
.ns
.TP
.L
join -1 3 -2 3 -o 1.1,2.1 temp temp | awk '$1 < $2'
Print all pairs of users with identical userids.
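.PP
As a further illustrative sketch, suppose two small sorted files,
.B a
and
.BR b ,
are created as follows.
The file names and their contents are hypothetical and do not come from any real system.
.EX
echo 'alice 30' >a; echo 'bob 25' >>a
echo 'alice admin' >b; echo 'carol dev' >>b
.EE
With the default field separation and the first field as the join field,
the commands below should produce the output noted in the comments:
.EX
join a b	# alice 30 admin
join -a 1 a b	# alice 30 admin, then bob 25
join -v 2 a b	# carol dev
.EE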
.SH SOURCE
.B /sys/src/cmd/join.c
.SH "SEE ALSO"
.IR sort (1),
.IR comm (1),
.IR awk (1)
.SH BUGS
With default field separation,
the collating sequence is that of
.BI "sort -b"
.BI -k y , y\f1;
with
.BR -t ,
the sequence is that of
.BI "sort -t" x
.BI -k y , y\f1.
.PP
One of the files must be randomly accessible.