An extendable output function (XOF) is defined as a variable-length hash function on a message in which the output can be extended to any desired length.
At a minimum, an XOF needs to support the following pseudo-code:
xof = xof.new();
xof.absorb(bytes1);
xof.absorb(bytes2);
xof.finalize();
out1 = xof.squeeze(10);
out2 = xof.squeeze(1000);
The current OpenSSL implementation of XOF only supports a single call to squeeze. The assumption exists in both the high-level call to EVP_DigestFinalXOF() and the lower-level SHA3_squeeze() operation (of which there is a generic C version as well as assembler code for different platforms).
A decision has to be made as to whether a new API is required, and how the change may affect existing applications. The changes introduced should have a minimal effect on other related functions that share the same code (e.g. SHAKE and SHA3 share functionality). Older providers that have not been updated to support this change should produce an error if a newer core is used that supports multiple squeeze operations.
Currently EVP_DigestFinalXOF() uses a flag to check that it is only invoked once. It returns an error if called more than once. When initially written it also did a reset, but that code was removed as it was deemed to be incorrect.
If we remove the flag check, then the core code will potentially call low-level squeeze code in an older provider that does not return correct data for multiple calls. To counter this, the provider needs a mechanism to indicate that multiple calls are allowed. This could just be a new gettable flag (having a separate provider function should not be necessary).
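The gating described above can be sketched with a small mock. This is only an illustration under stated assumptions: toy_provider_digest, toy_md_ctx, multi_squeeze_ok and toy_digest_squeeze are hypothetical names invented for this sketch and are not OpenSSL types, parameters or functions. It shows a core-side check that refuses a second squeeze unless the provider has advertised support.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical provider-side digest description (not an OpenSSL type). */
typedef struct {
    int multi_squeeze_ok;   /* hypothetical gettable flag */
    int (*squeeze)(void *state, unsigned char *out, size_t outlen);
} toy_provider_digest;

/* Hypothetical core-side context (not an OpenSSL type). */
typedef struct {
    const toy_provider_digest *prov;
    void *state;
    int squeezed;           /* set once output has been produced */
} toy_md_ctx;

/* Returns 1 on success and 0 on error. */
static int toy_digest_squeeze(toy_md_ctx *ctx, unsigned char *out,
                              size_t outlen)
{
    assert(ctx != NULL && ctx->prov != NULL);

    /* An older provider has not set the flag: refuse a second squeeze. */
    if (ctx->squeezed && !ctx->prov->multi_squeeze_ok)
        return 0;
    if (!ctx->prov->squeeze(ctx->state, out, outlen))
        return 0;
    ctx->squeezed = 1;
    return 1;
}

/* Dummy squeeze used only to exercise the gate. */
static int toy_fill(void *state, unsigned char *out, size_t outlen)
{
    (void)state;
    memset(out, 0xAB, outlen);
    return 1;
}
```

With a provider whose flag is 0, the first call succeeds and the second returns an error; with the flag set to 1, repeated calls are passed through to the provider.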
Solution A: Change EVP_DigestFinalXOF(ctx, out, outlen) to handle multiple calls, possibly with EVP_DigestSqueeze() as an alias. Changing the code at this level should be a simple matter of removing the flag check.
Solution B: Keep EVP_DigestFinalXOF() as a one-shot function and create a new API to handle the multi-squeeze case, e.g.
EVP_DigestSqueeze(ctx, out, outlen).
Solution C: Create a completely new type, e.g. EVP_XOF_MD, to implement XOF digests.
Currently OpenSSL only uses XOFs that are based on a sponge construction (which uses the terms absorb and squeeze). There will be other XOFs that do not use the sponge construction, such as Blake2.
The proposed API name is EVP_DigestSqueeze. The alternative name suggested was EVP_DigestExtract. The terms extract and expand are used by HKDF, so I think this name would be confusing.
The digest can be initialized as normal using:
md = EVP_MD_fetch(libctx, "SHAKE256", propq);
ctx = EVP_MD_CTX_new();
EVP_DigestInit_ex2(ctx, md, NULL);
Absorb can be done by multiple calls to:
EVP_DigestUpdate(ctx, in, inlen);
Do we want to have an alias function?
EVP_DigestAbsorb(ctx, in, inlen);
(The consensus was that this is not required).
The finalize is just done as part of the squeeze operation.
A reset can be done by calling:
EVP_DigestInit_ex2(ctx, NULL, NULL);
The internal state can be copied by calling:
EVP_MD_CTX_copy_ex(newctx, ctx);
The existing one-shot squeeze method is:
SHA3_squeeze(uint64_t A[5][5], unsigned char *out, size_t outlen, size_t r)
It contains an opaque object for storing the state A, that can be used to output to out. After every r bits, the state A is updated internally by calling KeccakF1600().
Unless you are using a multiple of r as the outlen, the function has no way of knowing where to start from if another call to SHA3_squeeze() was attempted. The method also currently avoids doing a final call to KeccakF1600(), since it was assumed that this was not required for a one-shot operation.
Solution 1: Modify the SHA3_squeeze code to accept an input/output parameter to track the position within the state A. See https://github.com/openssl/openssl/pull/13470
Solution 2: Leave SHA3_squeeze() as it is and buffer calls to the SHA3_squeeze() function inside the final. See https://github.com/openssl/openssl/pull/7921
Solution 3: Perform a one-shot squeeze on the original absorbed data and throw away the first part of the output buffer.
An alternative approach to solution 2 is to modify SHA3_squeeze() slightly so that a boolean can be passed in to handle the call to KeccakF1600() correctly for multiple calls.
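To make the trade-offs concrete, the following toy model compares a one-shot squeeze with two of the approaches listed above: tracking a position within the state, and redoing a one-shot squeeze while discarding the prefix. It is a sketch under loud assumptions: toy_permute is a hypothetical, non-cryptographic stand-in for KeccakF1600(), the state is 8 bytes with a 4-byte rate instead of the real Keccak sizes, and none of the toy_* names exist in OpenSSL.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_STATE 8   /* toy state size in bytes */
#define TOY_RATE  4   /* toy rate in bytes; the rest models the capacity */

/* Hypothetical stand-in for KeccakF1600(): NOT cryptographic. */
static void toy_permute(uint8_t A[TOY_STATE])
{
    uint8_t t[TOY_STATE];
    size_t i;

    for (i = 0; i < TOY_STATE; i++)
        t[i] = (uint8_t)(A[(i + 1) % TOY_STATE] * 5 + A[i] + i + 1);
    memcpy(A, t, TOY_STATE);
}

/* Stand-in for a state that has already absorbed and padded its input. */
static void toy_init(uint8_t A[TOY_STATE])
{
    size_t i;

    for (i = 0; i < TOY_STATE; i++)
        A[i] = (uint8_t)(i * 17 + 3);
}

/* One-shot squeeze: always starts reading at offset 0 of the state. */
static void toy_squeeze(uint8_t A[TOY_STATE], uint8_t *out, size_t outlen)
{
    while (outlen != 0) {
        size_t n = outlen < TOY_RATE ? outlen : TOY_RATE;

        memcpy(out, A, n);
        out += n;
        outlen -= n;
        if (outlen != 0)          /* no permutation after the last block */
            toy_permute(A);
    }
}

/*
 * Position tracking: *pos is an input/output offset into the rate part.
 * The permutation is deferred until another byte is actually needed.
 */
static void toy_squeeze_resumable(uint8_t A[TOY_STATE], size_t *pos,
                                  uint8_t *out, size_t outlen)
{
    while (outlen != 0) {
        size_t n;

        if (*pos == TOY_RATE) {   /* current block is exhausted */
            toy_permute(A);
            *pos = 0;
        }
        n = TOY_RATE - *pos;
        if (n > outlen)
            n = outlen;
        memcpy(out, A + *pos, n);
        *pos += n;
        out += n;
        outlen -= n;
    }
}

/*
 * Discard the prefix: redo a one-shot squeeze from a saved copy of the
 * state and keep only the bytes after the *done bytes already returned.
 */
static void toy_squeeze_discard(const uint8_t saved[TOY_STATE], size_t *done,
                                uint8_t *out, size_t outlen)
{
    uint8_t A[TOY_STATE], tmp[64];

    assert(*done + outlen <= sizeof(tmp));
    memcpy(A, saved, TOY_STATE);
    toy_squeeze(A, tmp, *done + outlen);
    memcpy(out, tmp + *done, outlen);
    *done += outlen;
}
```

In this model, calling toy_squeeze() twice for 3 bytes returns the same 3 bytes both times, which is the limitation described above. The resumable and discard variants both reproduce the output of a single longer one-shot squeeze when called in pieces, but the discard variant recomputes everything already returned on every call, so its cost grows with the total output length.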