.TH JOIN 1
.CT 1 files
.SH NAME
join \- relational database operator
.SH SYNOPSIS
.B join
[
.I options
]
.I file1 file2
.SH DESCRIPTION
.I Join
forms, on the standard output,
a join
of the two relations specified by the lines of
.I file1
and
.IR file2 .
If one of the file names is
.LR - ,
the standard input is used.
.PP
.I File1
and
.I file2
must be sorted in increasing
.SM UTF
collating
sequence on the fields
on which they are to be joined,
normally the first in each line.
.PP
There is one line in the output
for each pair of lines in
.I file1
and
.I file2
that have identical join fields.
The output line normally consists of the common field,
then the rest of the line from
.IR file1 ,
then the rest of the line from
.IR file2 .
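.PP
As an illustration of the pairing rule above, here is a minimal sketch,
assuming a POSIX shell and a join that accepts the default syntax.
The sample files ids and months are hypothetical and are already sorted
on their first field; the key ken occurs twice in months, so it yields
one output line per pair.

```shell
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/ids" <<EOF
ken 1
rob 2
rsc 3
EOF
cat > "$dir/months" <<EOF
ken Feb
ken Apr
rsc Mar
EOF
# Join on field 1 of both files; rob has no partner and is dropped.
# LC_ALL=C pins the collating sequence to byte order for this sketch.
basic=$(LC_ALL=C join "$dir/ids" "$dir/months")
echo "$basic"
rm -r "$dir"
```

Each output line carries the common field first, then the remainder of
the line from ids, then the remainder of the line from months.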
.PP
Input fields are normally separated by spaces or tabs;
output fields by a space.
In this case, multiple separators count as one, and
leading separators are discarded.
.PP
The following options are recognized, with POSIX syntax.
.TF "\fL-j\fIn m\fR "
.PD
.TP
.BI -a " n
In addition to the normal output,
produce a line for each unpairable line in file
.IR n ,
where
.I n
is 1 or 2.
.TP
.BI -v " n
Like
.BR -a ,
omitting output for paired lines.
.TP
.BI -e " s
Replace empty output fields by string
.IR s .
.TP
.BI -1 " m
.br
.ns
.TP
.BI -2 " m
Join on the
.IR m th
field of
.I file1
or
.IR file2 .
.TP
.BI -j "n m"
Archaic equivalent for
.BI - "n m\f1."
.TP
.BI -o fields
Each output line comprises the designated fields.
The comma-separated field designators are either
.BR 0 ,
meaning the join field, or have the form
.IR n . m ,
where
.I n
is a file number and
.I m
is a field number.
Archaic usage allows separate arguments for field designators.
.TP
.BI -t c
Use character
.I c
as the only field separator on input and output.
Every appearance of
.I c
in a line is significant,
so adjacent separators delimit an empty field.
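.PP
The options for unpairable lines can be sketched as follows,
assuming a POSIX shell and a join that accepts the syntax above.
The files ids and months and the padding string ? are hypothetical.
The -o list is given explicitly so that the position of the padded
field is unambiguous.

```shell
# Hypothetical sample data, sorted on field 1.
dir=$(mktemp -d)
cat > "$dir/ids" <<EOF
ken 1
rob 2
rsc 3
EOF
cat > "$dir/months" <<EOF
ken Feb
rsc Mar
EOF
# -v 1: print only the lines of file 1 that have no partner in file 2.
unpaired=$(LC_ALL=C join -v 1 "$dir/ids" "$dir/months")
# -a 1: keep unpairable lines of file 1 alongside the normal output;
# -o picks join field, field 2 of file 1, field 2 of file 2;
# -e fills the field that is missing for rob.
padded=$(LC_ALL=C join -a 1 -e '?' -o 0,1.2,2.2 "$dir/ids" "$dir/months")
echo "$unpaired"
echo "$padded"
rm -r "$dir"
```

Here -v 1 reports only rob, while -a 1 keeps rob in the joined output
with the missing month replaced by the -e string.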
.SH EXAMPLES
.TP
.L
sort -t: +1 /adm/users | join -t: -1 2 -a 1 -e "" - bdays
Add birthdays to the
.B /adm/users
file, leaving unknown
birthdays empty.
The layout of
.B /adm/users
is given in
.IR users (6);
.B bdays
contains sorted lines like
.LR "ken:Feb\ 4,\ 1953" .
.TP
.L
awk -F: '$3 != ""' /adm/users | tr : ' ' | sort -k 3,3 >temp
.br
.ns
.TP
.L
join -1 3 -2 3 -o 1.1,2.1 temp temp | awk '$1 < $2'
Print all pairs of users with identical non-empty userids.
.SH SOURCE
.B /sys/src/cmd/join.c
.SH "SEE ALSO"
.IR sort (1),
.IR comm (1),
.IR awk (1)
.SH BUGS
With default field separation,
the collating sequence is that of
.BI "sort -b"
.BI -k y , y\f1;
with
.BR -t ,
the sequence is that of
.BI "sort -t" x
.BI -k y , y\f1.
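.PP
A minimal sketch of preparing input with the matching sort command,
assuming a POSIX shell, sort and join.
The files byid and months are hypothetical; byid is keyed on its
second field, so it is sorted with -b -k 2,2 before being joined
on that field.

```shell
# Hypothetical sample data; byid is not yet in join order.
dir=$(mktemp -d)
cat > "$dir/byid" <<EOF
1 rob
2 ken
EOF
cat > "$dir/months" <<EOF
ken Feb
rob Jan
EOF
# Sort file 1 on field 2 with the same key that join will use,
# then join field 2 of the sorted stream with field 1 of months.
sorted=$(LC_ALL=C sort -b -k 2,2 "$dir/byid" |
	LC_ALL=C join -1 2 - "$dir/months")
echo "$sorted"
rm -r "$dir"
```

The join field is printed first, followed by the remaining field of
byid and then the remaining field of months.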
.PP
One of the files must be randomly accessible.