.TH WEBFS 4
.SH NAME
webfs \- world wide web file system
.SH SYNOPSIS
.B webfs
[
.B -c
.I cookiefile
]
[
.B -m
.I mtpt
]
[
.B -s
.I service
]
.SH DESCRIPTION
.I Webfs
presents a file system interface to the parsing and retrieving
of URLs.
.I Webfs
mounts itself at
.I mtpt
(default
.BR /mnt/web ),
and, if
.I service
is specified, will post a service file descriptor
in
.BR /srv/\fIservice .
.PP
.I Webfs
presents a three-level file system suggestive
of the network protocol hierarchies
.IR ip (3)
and
.IR ether (3).
.PP
The top level contains three files:
.BR ctl ,
.BR cookies ,
and
.BR clone .
.PP
The
.B ctl
file is used to maintain parameters global to the instance of
.IR webfs .
Reading the
.B ctl
file yields the current values of the parameters.
Writing strings of the form
.RB `` attr " " value ''
sets a particular attribute.
Attributes are:
.TP
.B chatty9p
The
.B chatty9p
flag used by the 9P library, discussed in
.IR 9p (2).
.B 0
is no debugging,
.B 1
prints 9P message traces on standard error,
and values above
.B 1
present more debugging, at the whim of the library.
The default for this and the following debug flags is
.BR 0 .
.TP
.B fsdebug
This variable is the level of debugging output about the file system module.
.TP
.B cookiedebug
This variable is the level of debugging output about the cookie module.
.TP
.B urldebug
This variable is the level of debugging output about URL parsing.
.TP
.B acceptcookies
This flag controls whether to accept cookies presented by remote web servers.
(Cookies are described below, in the discussion of the
.B cookies
file.)
The values
.B on
and
.B off
are synonymous with
.B 1
and
.BR 0 .
The default is
.BR on .
.TP
.B sendcookies
This flag controls whether to present stored cookies to remote web servers.
The default is
.BR on .
.TP
.B redirectlimit
Web servers can respond to a request with a message
redirecting to another page.
.I Webfs
makes no effort to determine whether it is in an infinite
redirect loop.
Instead, it gives up after this many redirects.
The default is
.BR 10 .
.TP
.B useragent
.I Webfs
sends the value of this attribute in its
.B User-Agent:
header in its HTTP requests.
The default is
.RB `` "webfs/2.0 (plan 9)" .''
.PD
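.PP
As a rough illustration of the control-string format described above,
the following Python sketch builds an
.RB `` attr " " value ''
string and writes it to a
.B ctl
file.
The helper names
.B ctl_message
and
.B set_attr
are hypothetical, and sending the string as a single write with no
trailing newline is an assumption, not documented behavior.
.EX
```python
import os

# Attribute names taken from the list above.
KNOWN_ATTRS = {
    "chatty9p", "fsdebug", "cookiedebug", "urldebug",
    "acceptcookies", "sendcookies", "redirectlimit", "useragent",
}

def ctl_message(attr, value):
    """Build an "attr value" control string for a webfs ctl file."""
    if attr not in KNOWN_ATTRS:
        raise ValueError("unknown attribute: " + attr)
    return attr + " " + str(value)

def set_attr(ctl_path, attr, value):
    """Send one control string to ctl_path in a single write."""
    # Assumption: one message per write, no trailing newline required.
    fd = os.open(ctl_path, os.O_WRONLY)
    try:
        os.write(fd, ctl_message(attr, value).encode("utf-8"))
    finally:
        os.close(fd)
```
.EE
.PP
With the default mount point, a call such as
.B set_attr("/mnt/web/ctl", "redirectlimit", 5)
would be expected to lower the redirect limit for connections
allocated afterwards.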
.PP
The top-level directory also contains
numbered directories corresponding to connections, each of which
may be used to fetch a single URL.
To allocate a connection, open the
.B clone
file and read a number
.I n
from it.
After opening, the
.B clone
file is equivalent to the file
.IB n /ctl \fR.
A connection is assumed closed once all files in its directory
have been closed, and will then be reallocated.
.PP
Each connection has its own private set of
.BR acceptcookies ,
.BR sendcookies ,
.BR redirectlimit ,
and
.B useragent
variables, initialized to the defaults set in the
root's
.B ctl
file.
The per-connection
.B ctl
file allows editing the variables for this particular connection.
.PP
Each connection also has a URL string variable
.B url
associated with it.
This URL may be an absolute URL such as
.I http://www.lucent.com/index.html
or a relative URL such as
.IR ../index.html .
The
.B baseurl
string variable sets the URL against which relative URLs
are interpreted.
Once the URL has been set,
its pieces can be retrieved via individual files in the
.B parsed
directory.
.I Webfs
parses the following URL syntaxes; names in italics are
the names of files in the
.B parsed
directory.
.IP
\fIscheme\f5:\fIschemedata
.br
\f5http://\fIhost\f5/\fIpath\fR[\f5?\fIquery\fR][\f5#\fIfragment\fR]
.br
\f5ftp://\fR[\fIuser\fR[\f5:\fIpassword\fR]\f5@\fR]\fP\f5\fIhost\f5/\fIpath\fR[\f5;type=\fIftptype\fR]
.br
\f5file:\fIpath
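.PP
The effect of
.B baseurl
and the decomposition of an
.B http
URL can be approximated outside
.I webfs
with Python's standard
.B urllib.parse
module.
This is only a sketch: it is not the parser
.I webfs
uses, the function names
.B resolve
and
.B pieces
are hypothetical, and whether the
.I path
file includes the leading slash, as Python's result does, is not stated here.
.EX
```python
from urllib.parse import urljoin, urlsplit

def resolve(baseurl, url):
    """Interpret url against baseurl; absolute URLs pass through unchanged."""
    return urljoin(baseurl, url)

def pieces(url):
    """Split an http URL into the pieces named in the syntax above."""
    u = urlsplit(url)
    return {
        "scheme": u.scheme,
        "host": u.hostname or "",
        "path": u.path,
        "query": u.query,
        "fragment": u.fragment,
    }
```
.EE
.PP
For instance,
.B resolve("http://www.lucent.com/docs/a.html", "../index.html")
yields
.BR http://www.lucent.com/index.html .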
.LP
If there is associated data to be
posted with the request, it can be written to
.BR postbody .
Finally, opening
.B body
initiates the request.
The resulting data may be read from
.B body
as it arrives.
After the request has been executed, the MIME content type
may be read from the
.B contenttype
file.
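.PP
The order of operations for a simple fetch can be sketched in Python as follows.
The function name
.B fetch
is hypothetical, and the sketch assumes that the
.B url
variable is set by writing
.RB `` "url " \fIvalue\fR''
to the connection's
.B ctl
file in the same way as the other variables.
The
.B clone
file is kept open until the end so that the connection is not
considered closed while
.B body
and
.B contenttype
are being read.
.EX
```python
import os

def fetch(url, mtpt="/mnt/web"):
    """Fetch url through a webfs-style tree; return (body, contenttype)."""
    # Opening clone allocates a connection; reading it yields the number n.
    ctl = os.open(os.path.join(mtpt, "clone"), os.O_RDWR)
    try:
        n = os.read(ctl, 32).decode("ascii").strip()
        conn = os.path.join(mtpt, n)
        # The open clone file now acts as n/ctl.
        # Assumption: url is set with an "attr value" string.
        os.write(ctl, ("url " + url).encode("utf-8"))
        # Opening body initiates the request; read until end of file.
        with open(os.path.join(conn, "body"), "rb") as f:
            body = f.read()
        # The content type is available once the request has been executed.
        with open(os.path.join(conn, "contenttype")) as f:
            contenttype = f.read().strip()
        return body, contenttype
    finally:
        # Closing the last open file lets the connection be reallocated.
        os.close(ctl)
```
.EE
.PP
A request with data to post would additionally write that data to
.B postbody
before opening
.BR body .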
.PP
The top-level
.B cookies
file contains the internal set of HTTP cookies, which
are used by HTTP servers to associate requests with persistent
state such as user profiles.
It may be edited as an ordinary text file.
Multiple instances of
.I webfs
and
.IR webcookies (4)
share cookies by keeping their internal set
consistent with the
.I cookiefile
(default
.BR $home/lib/webcookies ),
which has the same format.
.PP
These files contain one line per cookie;
each cookie comprises some number of
.IB attr = value
pairs.
Cookie attributes are:
.TP
.BI name= name
The name of the cookie on the remote server.
.TP
.BI value= value
The value associated with that name on the remote server.
The actual data included when a cookie is sent back
to the server is
.IB \fR``\fIname = value\fR''
(where, confusingly,
.I name
and
.I value
are the values associated with the
.B name
and
.B value
attributes).
.TP
.BI domain= domain
If
.I domain
is an IP address, the cookie can only be used for URLs
with
.I host
equal to that IP address.
Otherwise,
.I domain
must be a pattern beginning with a dot, and
the cookie can only be used for URLs with a
.I host
having
.I domain
as a suffix.
For example, a cookie with
.B domain=.bell-labs.com
may be used on hosts
.I www.bell-labs.com
and
.IR www.research.bell-labs.com
(but not
.IR www.not-bell-labs.com ).
.TP
.BI path= path
The cookie can only be used for URLs with a path
beginning with
.IR path .
.TP
.BI version= version
The version of the HTTP cookie specification, specified by the server.
.TP
.BI comment= comment
A comment, specified by the server.
.TP
.BI expire= expire
The cookie expires at time
.IR expire ,
which is a decimal number of seconds since the epoch.
.TP
.B secure=1
The cookie may only be used over secure
.RB ( https )
connections.
Secure connections are currently unimplemented.
.TP
.B explicitdomain=1
The domain associated with this cookie was set by
the server (rather than inferred from a URL).
.TP
.B explicitpath=1
The path associated with this cookie was set by the
server (rather than inferred from a URL).
.TP
.B netscapestyle=1
The server presented the cookie in ``Netscape style,'' which
does not conform to the cookie standard, RFC2109.
It is assumed that when presenting the cookie to the server,
it must be sent back in Netscape style as well.
.PD
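.PP
The
.BR domain ,
.BR path ,
and
.B expire
rules above can be restated as a short Python sketch.
All function names here are hypothetical.
The sketch assumes that the pairs on a cookie line are separated by
white space with unquoted values, that host names compare without
regard to case, and that a cookie is no longer usable once the current
time reaches
.IR expire ;
none of these details is specified above.
.EX
```python
import ipaddress
import time

def parse_cookie_line(line):
    """Split one cookie line into a dict of attr=value pairs."""
    # Assumption: pairs are separated by white space, values unquoted.
    cookie = {}
    for pair in line.split():
        attr, _, value = pair.partition("=")
        cookie[attr] = value
    return cookie

def is_ip(s):
    """Report whether s is a literal IP address."""
    try:
        ipaddress.ip_address(s)
        return True
    except ValueError:
        return False

def domain_ok(domain, host):
    """Exact match for IP addresses, otherwise dot-prefixed suffix match."""
    domain, host = domain.lower(), host.lower()
    if is_ip(domain):
        return host == domain
    return domain.startswith(".") and host.endswith(domain)

def usable(cookie, host, path, now=None):
    """Check the domain, path and expire attributes against a request."""
    now = time.time() if now is None else now
    if "expire" in cookie and int(cookie["expire"]) <= now:
        return False
    return (domain_ok(cookie.get("domain", ""), host)
            and path.startswith(cookie.get("path", "")))
```
.EE
.PP
Under these assumptions,
.B domain_ok(".bell-labs.com", "www.not-bell-labs.com")
is false because the character before
.B bell-labs.com
in the host is not a dot.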
.SH EXAMPLE
.B /sys/src/cmd/webfs/webget.c
is a simple client.
.SH SOURCE
.B /sys/src/cmd/webfs
.SH SEE ALSO
.IR hget (1),
.IR webcookies (4)
.SH BUGS
It's not clear what the relationship between
.IR hget ,
.I webcookies
and
.I webfs
should be.