Discussion about this post

Patrick B

I'm just thinking out loud here, but...

If the token window is limited, and we want to have a larger input + output than the window allows...

Could we use a sort of rolling "convolution" over the input in order to get an output that is larger than the window might allow?

E.g.:

Window: 8k tokens

Input: 20k tokens

Algo:

Parse the first 4k tokens of input, generate 2k tokens of output

Combine that 2k of output with the next 4k tokens of input (6k in the window, leaving 2k free), then generate another 2k tokens of output

Repeat until the input has been fully parsed, or continue until the window is full with output tokens.
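
The steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `rolling_generate` and the `generate` callback are hypothetical names, not a real library API, and `generate(prompt, max_new)` stands in for whatever model call returns up to `max_new` tokens. The sketch handles only the first stopping condition (the input has been fully consumed).

```python
from typing import Callable, List


def rolling_generate(
    tokens: List[str],
    generate: Callable[[List[str], int], List[str]],
    window: int = 8000,
    chunk: int = 4000,
    out_per_step: int = 2000,
) -> List[str]:
    """Walk the input in chunks, carrying the previous output into each step.

    `generate` is a hypothetical stand-in for a model call: it receives the
    prompt tokens and a cap on new tokens, and returns the generated tokens.
    """
    # Worst case per step: carried output + input chunk + newly generated output.
    if chunk + 2 * out_per_step > window:
        raise ValueError("carry + chunk + new output must fit in the window")

    carry: List[str] = []    # output of the previous step (empty on the first step)
    outputs: List[str] = []  # everything generated so far, in order

    for start in range(0, len(tokens), chunk):
        prompt = carry + tokens[start:start + chunk]
        # Truncate defensively in case the model returns more than requested.
        carry = generate(prompt, out_per_step)[:out_per_step]
        outputs.extend(carry)

    return outputs
```

With the numbers from the example (8k window, 4k chunks, 2k of output per step), a 20k-token input takes five passes and yields 10k tokens of output in total, while no single pass uses more than 8k tokens of prompt plus output.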

Punit Thakkar

Your technical understanding of the AI world is really impressive. Been reading your work for a few months, thought I must share my appreciation for your tech chops. 🙏
