microsoft-llmlingua.hf.space
LLMLINGUA: COMPRESSING PROMPTS FOR ACCELERATED INFERENCE OF LARGE LANGUAGE MODELS (EMNLP 2023) [PAPER]

Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, and Lili Qiu

THIS IS AN EARLY DEMO OF THE PROMPT COMPRESSION METHOD LLMLINGUA, AND ITS CAPABILITIES ARE LIMITED: IT IS RESTRICTED TO THE GPT-2 SMALL SIZE MODEL.

It should be noted that, due to limited resources, we only provide the GPT-2 small size language model in this demo. Using LLaMA-2-7B as the small language model would result in a significant performance improvement, especially at high compression ratios.

To use it, upload your prompt and set the compression target.

1. ✅ Set the different components of the prompt separately, including instruction, context, and question. Leave the corresponding field empty if a particular component does not exist.
   * Question: This refers to the directives given by the user to the LLMs, such as inquiries, questions, or requests. Positioned after the instruction and context modules, the question module has a high sensitivity to compression.
   * Context: This module provides the supplementary context needed to address the question, such as documents, demonstrations, web search results, or API call results. Located between the instruction and question modules, its sensitivity to compression is relatively low.
   * Instruction: This module consists of directives given by the user to the LLMs, such as task descriptions. Placed before the context and question modules, the instruction module exhibits a high sensitivity to compression.
2. ✅ Set the target_token or compression ratio.
3. 🤔 Try experimenting with different target compression ratios or other hyperparameters to optimize performance. (A programmatic sketch of the same workflow is given below, after the GSM8K example description.)

You can check our project page! We also have a follow-up work for long-context scenarios, LongLLMLingua, which reduces cost while even improving downstream performance: LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression (under review).

NEWS

* 🦚 We're excited to announce the release of LLMLingua-2, boasting a 3x-6x speed improvement over LLMLingua! For more information, check out our paper, visit the project page, and explore our demo.
* 👾 LLMLingua has been integrated into LangChain and LlamaIndex, two widely used RAG frameworks.
* 🤳 Talk slides are available in AI Time, Jan 24.
* 🖥 EMNLP'23 slides are available in Session 5 and BoF-6.
* 📚 Check out our new blog post discussing RAG benefits and cost savings through prompt compression. See the script example here.
* 🎈 Visit our project page for real-world case studies in RAG, Online Meetings, CoT, and Code.
* 👨🦯 Explore our './examples' directory for practical applications, including LLMLingua-2, RAG, Online Meeting, CoT, Code, and RAG using LlamaIndex.

The demo interface exposes the following fields:

Prompts
* Instruction
* Context
* Question

Compression Target
* Target Token (to use this, set Compression Ratio to 0)
* Compression Ratio (to use this, set Target Token to -1)
* Compress Prompt!

Compressed Prompts
* compressed_prompt

Saving
* Number of tokens in the original prompt
* Number of tokens in the compressed prompt
* Actual Compression Ratio
* Saving Cost

EXAMPLES IN GSM8K

Below are some examples of compressing prompts from GSM8K using different small language models. The original prompt [1] is taken from "Complexity-Based Prompting for Multi-step Reasoning" [2], with an original length of 2,365 tokens. The black-box LLM is GPT-3.5-Turbo-0301 with greedy decoding.

[1] https://github.com/FranxYao/chain-of-thought-hub/blob/main/gsm8k/lib_prompt/prompt_hardest.txt
[2] Fu, Yao, et al. "Complexity-Based Prompting for Multi-step Reasoning." The Eleventh International Conference on Learning Representations, 2023.
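The workflow described above can also be run outside the demo with the llmlingua Python package. The sketch below is illustrative rather than a definitive recipe: the model name, file path, instruction/question strings, and the target_token value are assumptions chosen to mirror the demo, and exact parameter defaults may differ across package versions.

```python
# Minimal sketch, assuming the public llmlingua package and a local copy of the
# GSM8K prompt from reference [1]. All concrete strings and values are illustrative.
from llmlingua import PromptCompressor

# The demo is restricted to a GPT-2 small size model; LLaMA-2-7B would perform
# better, especially at high compression ratios.
compressor = PromptCompressor(model_name="gpt2")

instruction = "Please reference the following examples and answer the math question."  # hypothetical
context = open("prompt_hardest.txt", encoding="utf-8").read()  # the 2,365-token prompt from [1], saved locally
question = "Question: Josh buys 3 notebooks for $2 each. How much does he spend? Let's think step by step."  # hypothetical

result = compressor.compress_prompt(
    context=context.split("\n\n"),  # context (demonstrations) is least sensitive to compression
    instruction=instruction,        # placed first; highly sensitive to compression
    question=question,              # placed last; highly sensitive to compression
    target_token=200,               # fixed token budget; alternatively pass a compression ratio instead
)

print(result["compressed_prompt"])  # the compressed prompt to send to the black-box LLM
```

The returned result also reports the original and compressed token counts and the achieved compression ratio, which is what the demo's Saving panel displays.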
"Complexity-Based Prompting for Multi-step Reasoning." The Eleventh International Conference on Learning Representations. 2022. Small Language Model Compression Ratio GSM8K Acc using GPT-3.5-Turbo Compressed Prompts lgaalves/gpt2-dolly 13.8x 78.24 Question: can buy 4 1melon for You bought 36 fruits evenly split between of 1 $. does cost if bill $ 's think step If between 3 then I 363 = 12 of fruit 1 orange then oranges506If I oranges I $66 $60 on the 2 fruit the of is, and that price and is 1W4AIf we know we bought 12 and 12W thatW can the 12 = 48 of apple (60/The 1 : Sam a dozen boxes with 30ighter pens each Heanged into six3 the separately of three. much in 's boxes $120 12 =Sam then took 5 boxes × 6 highlighters/box = 30 highlighters. He sold these boxes for 5 * $3 = $15 After selling these 5 boxes there were 360 - 30 = 330 highlighters remaining. These form 330 / 3 = 110 groups of three pens. He sold each of these groups for $2 each, so made 110 * 2 = $220 from them. In total, then, he earned $220 + $15 = $235. Since his original cost was $120, he earned $235 - $120 = $115 in profit. The answer is 115 Small Language Model Compression Ratio GSM8K Acc using GPT-3.5-Turbo Compressed Prompts lgaalves/gpt2-dolly 8.7x 78.24 Question: can buy 4 1melon for You bought 36 fruits evenly split between of 1 $. does cost if bill $ 's think step If between 3 then I 363 = 12 of fruit 1 orange then oranges506If I oranges I $66 $60 on the 2 fruit the of is, and that price and is 1W4AIf we know we bought 12 and 12W thatW can the 12 = 48 of apple (60/The 1 : Sam a dozen boxes with 30ighter pens each Heanged into six3 the separately of three. much in 's boxes $120 12 =Sam then took 5 boxes × 6 highlighters/box = 30 highlighters. He sold these boxes for 5 * $3 = $15 After selling these 5 boxes there were 360 - 30 = 330 highlighters remaining. These form 330 / 3 = 110 groups of three pens. He sold each of these groups for $2 each, so made 110 * 2 = $220 from them. In total, then, he earned $220 + $15 = $235. Since his original cost was $120, he earned $235 - $120 = $115 in profit. The answer is 115 vicgalle/alpaca-7b 13.8x 78.32 Question: Sam bought a dozen boxes, each 30 highl pens inside, $10 each. He reanged five of boxes into of sixlters each sold $3. He sold the theters separately at the of three $2. How much did make in total, in Lets think step Sam bought boxes x0 = $10 oflters. He 2 300ters in Sam then 5 boxes 6ters0ters He sold these boxes for 55 Afterelling these boxes there300lters remaining These form 330 310 of three pens He sold each of these groups for2 each, so made 0 *0 from In total, he $ $155 Since his original $1, he earned $20 = $115 in profit. The answer is 115 vicgalle/alpaca-7b 20.2x 77.94 Question: Sam bought a dozen boxes, each with 30 highl pens inside, for $10 each. He reanged five of boxes into of sixlters each sold them $3 per package. He sold the rest of thelters separately at the of three pens for $2. How much profit did make in total, in dollars Let's think step by step Sam then took 5 boxes × 6lighters/box = 30 highlighters. These form 330 / 3 = 110 groups of three pens. The answer is 115 Built with Gradio