
LLMLINGUA: COMPRESSING PROMPTS FOR ACCELERATED INFERENCE OF LARGE LANGUAGE
MODELS (EMNLP 2023) [PAPER]

Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang and Lili Qiu


THIS IS AN EARLY DEMO OF THE PROMPT COMPRESSION METHOD LLMLINGUA. ITS
CAPABILITIES ARE LIMITED AND RESTRICTED TO USING ONLY THE GPT-2 SMALL SIZE MODEL.

Note that due to limited resources, we only provide the GPT2-Small size
language model in this demo. Using LLaMA2-7B as the small language model would
yield a significant performance improvement, especially at high compression
ratios.
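
For local use, the small language model is a constructor argument rather than a
fixed choice. Below is a minimal sketch, assuming the pip-installable llmlingua
package and the Hugging Face model ids shown here:

```python
from llmlingua import PromptCompressor

# GPT-2-small-class model, matching the restriction of this demo.
compressor_small = PromptCompressor(model_name="gpt2")

# LLaMA2-7B as the small language model; should improve quality,
# especially at high compression ratios, at the cost of GPU memory.
compressor_large = PromptCompressor(model_name="NousResearch/Llama-2-7b-hf")
```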

To use it, upload your prompt and set the compression target.

 1. ✅ Set the different components of the prompt separately, including
    instruction, context, and question. Leave the corresponding field empty if a
    particular component does not exist.
    * Question: This refers to the directives given by the user to the LLMs,
      such as inquiries, questions, or requests. Positioned after the
      instruction and context modules, the question module has a high
      sensitivity to compression.
    * Context: This module provides the supplementary context needed to address
      the question, such as documents, demonstrations, web search results, or
      API call results. Located between the instruction and question modules,
      its sensitivity to compression is relatively low.
    * Instruction: This module consists of directives given by the user to the
      LLMs, such as task descriptions. Placed before the context and question
      modules, the instruction module exhibits a high sensitivity to
      compression.
 2. ✅ Set the target_token or compression ratio.
 3. 🤔 Try experimenting with different target compression ratios or other
    hyperparameters to optimize performance; see the usage sketch after this
    list.
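
Outside the web UI, the three modules and the compression target map onto a
single call. Here is a minimal sketch, assuming the llmlingua package's
PromptCompressor.compress_prompt interface, with placeholder strings standing
in for the three modules:

```python
from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(model_name="gpt2")  # demo-sized small LM

result = llm_lingua.compress_prompt(
    # Context: supplementary material; relatively low compression sensitivity.
    context=["<document, demonstration, or search-result text>"],
    # Instruction: placed first; high compression sensitivity.
    instruction="Please answer the question based on the given context.",
    # Question: placed last; high compression sensitivity.
    question="<the user's question>",
    target_token=200,  # absolute token budget; a ratio can be used instead
)

# The returned dict carries the compressed prompt plus bookkeeping such as
# the token counts of the original and compressed prompts.
print(result["compressed_prompt"])
print(result["origin_tokens"], "->", result["compressed_tokens"])
```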

You can check our project page!

We also have a work on compressing prompts in long-context scenarios that
reduces cost while even improving downstream performance:
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via
Prompt Compression (under review).



NEWS

 * 🦚 We're excited to announce the release of LLMLingua-2, boasting a 3x-6x
   speed improvement over LLMLingua! For more information, check out our paper,
   visit the project page, and explore our demo.
 * 👾 LLMLingua has been integrated into LangChain and LlamaIndex, two
   widely-used RAG frameworks.
 * 🤳 Talk slides are available in AI Time, Jan 24.
 * 🖥 EMNLP'23 slides are available in Session 5 and BoF-6.
 * 📚 Check out our new blog post discussing RAG benefits and cost savings
   through prompt compression. See the script example here.
 * 🎈 Visit our project page for real-world case studies in RAG, Online
   Meetings, CoT, and Code.
 * 👨‍🦯 Explore our './examples' directory for practical applications,
   including LLMLingua-2, RAG, Online Meeting, CoT, Code, and RAG using
   LlamaIndex.

Demo interface: enter the Instruction, Context, and Question, then set the
compression target either as a Target Token count (set Compression Ratio to 0)
or as a Compression Ratio (set Target Token to -1). Pressing "Compress Prompt!"
returns the compressed prompt along with the token counts of the original and
compressed prompts, the actual compression ratio, and the estimated cost
saving.


EXAMPLES IN GSM8K

Below are some examples of compressing prompts from GSM8K using different small
language models. The original prompt [1] is taken from "Complexity-Based
Prompting for Multi-step Reasoning" [2] and is 2,365 tokens long. The black-box
LLM is GPT-3.5-Turbo-0301 with greedy decoding.

[1]
https://github.com/FranxYao/chain-of-thought-hub/blob/main/gsm8k/lib_prompt/prompt_hardest.txt
[2] Fu, Yao, et al. "Complexity-Based Prompting for Multi-step Reasoning." The
Eleventh International Conference on Learning Representations (ICLR). 2023.
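
The setup above can be sketched end to end as follows, under two assumptions
that go beyond this page: the llmlingua package's compress_prompt call and the
legacy openai (<1.0) ChatCompletion API, with OPENAI_API_KEY set in the
environment. The file name is the prompt file from [1], downloaded locally:

```python
import openai
from llmlingua import PromptCompressor

# The 2,365-token chain-of-thought prompt from [1], saved locally.
demonstrations = open("prompt_hardest.txt").read()
question = "Question: <a GSM8K test question>\nLet's think step by step."

llm_lingua = PromptCompressor(model_name="gpt2")  # small LM doing compression
compressed = llm_lingua.compress_prompt(
    context=demonstrations.split("\n\n"),  # one demonstration per context entry
    question=question,
    target_token=170,  # about 2365 / 13.8, matching the 13.8x rows below
)

# Greedy decoding (temperature=0) against the black-box LLM.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0301",
    messages=[{"role": "user", "content": compressed["compressed_prompt"]}],
    temperature=0,
)
print(response["choices"][0]["message"]["content"])
```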

Small Language Model: lgaalves/gpt2-dolly
Compression Ratio: 13.8x
GSM8K Acc using GPT-3.5-Turbo: 78.24
Compressed Prompt:
Question: can buy 4 1melon for You bought 36 fruits evenly split between of 1 $.
does cost if bill $ 's think step If between 3 then I 363 = 12 of fruit 1 orange
then oranges506If I oranges I $66 $60 on the 2 fruit the of is, and that price
and is 1W4AIf we know we bought 12 and 12W thatW can the 12 = 48 of apple
(60/The 1 : Sam a dozen boxes with 30ighter pens each Heanged into six3 the
separately of three. much in 's boxes $120 12 =Sam then took 5 boxes × 6
highlighters/box = 30 highlighters. He sold these boxes for 5 * $3 = $15 After
selling these 5 boxes there were 360 - 30 = 330 highlighters remaining. These
form 330 / 3 = 110 groups of three pens. He sold each of these groups for $2
each, so made 110 * 2 = $220 from them. In total, then, he earned $220 + $15 =
$235. Since his original cost was $120, he earned $235 - $120 = $115 in profit.
The answer is 115

Small Language Model: lgaalves/gpt2-dolly
Compression Ratio: 8.7x
GSM8K Acc using GPT-3.5-Turbo: 78.24
Compressed Prompt:
Question: can buy 4 1melon for You bought 36 fruits evenly split between of 1 $.
does cost if bill $ 's think step If between 3 then I 363 = 12 of fruit 1 orange
then oranges506If I oranges I $66 $60 on the 2 fruit the of is, and that price
and is 1W4AIf we know we bought 12 and 12W thatW can the 12 = 48 of apple
(60/The 1 : Sam a dozen boxes with 30ighter pens each Heanged into six3 the
separately of three. much in 's boxes $120 12 =Sam then took 5 boxes × 6
highlighters/box = 30 highlighters. He sold these boxes for 5 * $3 = $15 After
selling these 5 boxes there were 360 - 30 = 330 highlighters remaining. These
form 330 / 3 = 110 groups of three pens. He sold each of these groups for $2
each, so made 110 * 2 = $220 from them. In total, then, he earned $220 + $15 =
$235. Since his original cost was $120, he earned $235 - $120 = $115 in profit.
The answer is 115

Small Language Model: vicgalle/alpaca-7b
Compression Ratio: 13.8x
GSM8K Acc using GPT-3.5-Turbo: 78.32
Compressed Prompt:
Question: Sam bought a dozen boxes, each 30 highl pens inside, $10 each. He
reanged five of boxes into of sixlters each sold $3. He sold the theters
separately at the of three $2. How much did make in total, in Lets think step
Sam bought boxes x0 = $10 oflters. He 2 300ters in Sam then 5 boxes 6ters0ters
He sold these boxes for 55 Afterelling these boxes there300lters remaining These
form 330 310 of three pens He sold each of these groups for2 each, so made 0 *0
from In total, he $ $155 Since his original $1, he earned $20 = $115 in profit.
The answer is 115

Small Language Model: vicgalle/alpaca-7b
Compression Ratio: 20.2x
GSM8K Acc using GPT-3.5-Turbo: 77.94
Compressed Prompt:
Question: Sam bought a dozen boxes, each with 30 highl pens inside, for $10
each. He reanged five of boxes into of sixlters each sold them $3 per package.
He sold the rest of thelters separately at the of three pens for $2. How much
profit did make in total, in dollars Let's think step by step Sam then took 5
boxes × 6lighters/box = 30 highlighters. These form 330 / 3 = 110 groups of
three pens. The answer is 115
