
Inline URL preview documentation. (#13261)

Inline URL preview documentation near the implementation.
Patrick Cloke 1 year ago
parent
commit
1381563988

+ 1 - 2
changelog.d/13233.doc

@@ -1,2 +1 @@
-Add a link to configuration instructions in the URL preview documentation.
-
+Move the documentation for how URL previews work to the URL preview module.

+ 1 - 0
changelog.d/13261.doc

@@ -0,0 +1 @@
+Move the documentation for how URL previews work to the URL preview module.

+ 0 - 1
docs/SUMMARY.md

@@ -35,7 +35,6 @@
     - [Application Services](application_services.md)
     - [Server Notices](server_notices.md)
     - [Consent Tracking](consent_tracking.md)
-    - [URL Previews](development/url_previews.md)
     - [User Directory](user_directory.md)
     - [Message Retention Policies](message_retention_policies.md)
     - [Pluggable Modules](modules/index.md)

+ 1 - 1
docs/admin_api/user_admin_api.md

@@ -544,7 +544,7 @@ Gets a list of all local media that a specific `user_id` has created.
 These are media that the user has uploaded themselves
 ([local media](../media_repository.md#local-media)), as well as
 [URL preview images](../media_repository.md#url-previews) requested by the user if the
-[feature is enabled](../development/url_previews.md).
+[feature is enabled](../usage/configuration/config_documentation.md#url_preview_enabled).
 
 By default, the response is ordered by descending creation date and ascending media ID.
 The newest media is on top. You can change the order with parameters

+ 0 - 62
docs/development/url_previews.md

@@ -1,62 +0,0 @@
-URL Previews
-============
-For information on how to enable URL previews in synapse, please see the [config manual](../usage/configuration/config_documentation.md#url_preview_enabled).
-
-The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API
-for URLs which outputs [Open Graph](https://ogp.me/) responses (with some Matrix
-specific additions).
-
-This does have trade-offs compared to other designs:
-
-* Pros:
-  * Simple and flexible; can be used by any clients at any point
-* Cons:
-  * If each homeserver provides one of these independently, all the HSes in a
-    room may needlessly DoS the target URI
-  * The URL metadata must be stored somewhere, rather than just using Matrix
-    itself to store the media.
-  * Matrix cannot be used to distribute the metadata between homeservers.
-
-When Synapse is asked to preview a URL it does the following:
-
-1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the
-   config).
-2. Checks the in-memory cache by URLs and returns the result if it exists. (This
-   is also used to de-duplicate processing of multiple in-flight requests at once.)
-3. Kicks off a background process to generate a preview:
-   1. Checks the database cache by URL and timestamp and returns the result if it
-      has not expired and was successful (a 2xx return code).
-   2. Checks if the URL matches an [oEmbed](https://oembed.com/) pattern. If it
-      does, update the URL to download.
-   3. Downloads the URL and stores it into a file via the media storage provider
-      and saves the local media metadata.
-   4. If the media is an image:
-      1. Generates thumbnails.
-      2. Generates an Open Graph response based on image properties.
-   5. If the media is HTML:
-      1. Decodes the HTML via the stored file.
-      2. Generates an Open Graph response from the HTML.
-      3. If a JSON oEmbed URL was found in the HTML via autodiscovery:
-         1. Downloads the URL and stores it into a file via the media storage provider
-            and saves the local media metadata.
-         2. Convert the oEmbed response to an Open Graph response.
-         3. Override any Open Graph data from the HTML with data from oEmbed.
-      4. If an image exists in the Open Graph response:
-         1. Downloads the URL and stores it into a file via the media storage
-            provider and saves the local media metadata.
-         2. Generates thumbnails.
-         3. Updates the Open Graph response based on image properties.
-   6. If the media is JSON and an oEmbed URL was found:
-      1. Convert the oEmbed response to an Open Graph response.
-      2. If a thumbnail or image is in the oEmbed response:
-         1. Downloads the URL and stores it into a file via the media storage
-            provider and saves the local media metadata.
-         2. Generates thumbnails.
-         3. Updates the Open Graph response based on image properties.
-   7. Stores the result in the database cache.
-4. Returns the result.
-
-The in-memory cache expires after 1 hour.
-
-Expired entries in the database cache (and their associated media files) are
-deleted every 10 seconds. The default expiration time is 1 hour from download.

+ 1 - 4
docs/media_repository.md

@@ -7,8 +7,7 @@ The media repository
    users.
  * caches avatars, attachments and their thumbnails for media uploaded by remote
    users.
- * caches resources and thumbnails used for
-   [URL previews](development/url_previews.md).
+ * caches resources and thumbnails used for URL previews.
 
 All media in Matrix can be identified by a unique
 [MXC URI](https://spec.matrix.org/latest/client-server-api/#matrix-content-mxc-uris),
@@ -59,8 +58,6 @@ remote_thumbnail/matrix.org/aa/bb/cccccccccccccccccccc/128-96-image-jpeg
 Note that `remote_thumbnail/` does not have an `s`.
 
 ## URL Previews
-See [URL Previews](development/url_previews.md) for documentation on the URL preview
-process.
 
 When generating previews for URLs, Synapse may download and cache various
 resources, including images. These resources are assigned temporary media IDs

+ 58 - 4
synapse/rest/media/v1/preview_url_resource.py

@@ -109,10 +109,64 @@ class MediaInfo:
 
 class PreviewUrlResource(DirectServeJsonResource):
     """
-    Generating URL previews is a complicated task which many potential pitfalls.
-
-    See docs/development/url_previews.md for discussion of the design and
-    algorithm followed in this module.
+    The `GET /_matrix/media/r0/preview_url` endpoint provides a generic preview API
+    for URLs which outputs Open Graph (https://ogp.me/) responses (with some
+    Matrix-specific additions).
+
+    This does have trade-offs compared to other designs:
+
+    * Pros:
+      * Simple and flexible; can be used by any client at any point
+    * Cons:
+      * If each homeserver provides one of these independently, all the homeservers in a
+        room may needlessly DoS the target URI
+      * The URL metadata must be stored somewhere, rather than just using Matrix
+        itself to store the media.
+      * Matrix cannot be used to distribute the metadata between homeservers.
+
+    When Synapse is asked to preview a URL it does the following:
+
+    1. Checks against a URL blacklist (defined as `url_preview_url_blacklist` in the
+       config).
+    2. Checks the URL against an in-memory cache and returns the result if it exists. (This
+       is also used to de-duplicate processing of multiple in-flight requests at once.)
+    3. Kicks off a background process to generate a preview:
+       1. Checks URL and timestamp against the database cache and returns the result if it
+          has not expired and was successful (a 2xx return code).
+       2. Checks if the URL matches an oEmbed (https://oembed.com/) pattern. If it
+          does, updates the URL to download.
+       3. Downloads the URL and stores it into a file via the media storage provider
+          and saves the local media metadata.
+       4. If the media is an image:
+          1. Generates thumbnails.
+          2. Generates an Open Graph response based on image properties.
+       5. If the media is HTML:
+          1. Decodes the HTML via the stored file.
+          2. Generates an Open Graph response from the HTML.
+          3. If a JSON oEmbed URL was found in the HTML via autodiscovery:
+             1. Downloads the URL and stores it into a file via the media storage provider
+                and saves the local media metadata.
+             2. Converts the oEmbed response to an Open Graph response.
+             3. Overrides any Open Graph data from the HTML with data from oEmbed.
+          4. If an image exists in the Open Graph response:
+             1. Downloads the URL and stores it into a file via the media storage
+                provider and saves the local media metadata.
+             2. Generates thumbnails.
+             3. Updates the Open Graph response based on image properties.
+       6. If the media is JSON and an oEmbed URL was found:
+          1. Converts the oEmbed response to an Open Graph response.
+          2. If a thumbnail or image is in the oEmbed response:
+             1. Downloads the URL and stores it into a file via the media storage
+                provider and saves the local media metadata.
+             2. Generates thumbnails.
+             3. Updates the Open Graph response based on image properties.
+       7. Stores the result in the database cache.
+    4. Returns the result.
+
+    The in-memory cache expires after 1 hour.
+
+    Expired entries in the database cache (and their associated media files) are
+    deleted every 10 seconds. The default expiration time is 1 hour from download.
     """
 
     isLeaf = True
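
The lookup order described in the docstring (blacklist, in-memory cache, database cache, generate, store, return) can be illustrated with a minimal sketch. This is not Synapse's implementation: `PreviewSketch`, `CacheEntry`, `blocked_hosts` and the `generate` callback are hypothetical names, the code is synchronous, the database is replaced by a plain dict, the blacklist is reduced to an exact hostname match, and both caches share a single one-hour expiry for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional
from urllib.parse import urlsplit

ONE_HOUR_MS = 60 * 60 * 1000  # expiry time stated in the docstring


@dataclass
class CacheEntry:
    og: Dict[str, str]   # the Open Graph response
    response_code: int   # HTTP status of the original download
    expires_ms: int      # absolute expiry time, in milliseconds


class PreviewSketch:
    """Hypothetical, synchronous stand-in for the lookup order above."""

    def __init__(
        self,
        blocked_hosts: List[str],
        generate: Callable[[str], Dict[str, str]],
    ) -> None:
        self.blocked_hosts = blocked_hosts
        self.generate = generate  # stands in for steps 3.2 to 3.6
        self.memory_cache: Dict[str, CacheEntry] = {}
        self.db_cache: Dict[str, CacheEntry] = {}  # a dict, not a real database

    def preview(self, url: str, now_ms: int) -> Optional[Dict[str, str]]:
        # Step 1: refuse blacklisted URLs (simplified to a hostname match).
        host = urlsplit(url).hostname or ""
        if host in self.blocked_hosts:
            return None

        # Step 2: return a live in-memory entry if there is one.
        entry = self.memory_cache.get(url)
        if entry is not None and entry.expires_ms > now_ms:
            return entry.og

        # Step 3.1: reuse the database entry only if it is unexpired and 2xx.
        entry = self.db_cache.get(url)
        usable = (
            entry is not None
            and entry.expires_ms > now_ms
            and 200 <= entry.response_code < 300
        )
        if not usable:
            # Steps 3.2 to 3.7: generate a preview and store it.
            # The sketch assumes generation succeeded with a 200.
            entry = CacheEntry(self.generate(url), 200, now_ms + ONE_HOUR_MS)
            self.db_cache[url] = entry

        # Step 4: remember the result in memory and return it.
        self.memory_cache[url] = entry
        return entry.og
```

Calling `preview()` twice for the same URL within the hour invokes `generate` only once; a call after the expiry time, or one that finds a non-2xx database entry, generates the preview again. The sketch does not model the de-duplication of concurrent in-flight requests or the periodic deletion of expired entries mentioned in the docstring.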