├── autotest
├── bnchnarks.here.md
├── promptLibv2Qwen.py
├── 101.Falcon3-1b-instruct_llamaserver_noHistoryCHAT.py
└── 100.Falcon3-1b-instruct_llamaserver_promptTest.py
├── image001.png
├── image002.png
├── image003.png
├── falconBKG.png
├── falco3-eval.png
├── falco3-logo.png
├── BattleOf2B-2024.png
├── falcon3_llamacpp.gif
├── 202412.BattleOfTheModels-2B_tableDetailed.xlsx
├── Falcon3-1B_LCPPsvr_RYGOX.csv
├── README.md
├── promptLibv2Falcon3.py
├── testFalcon3-1B-it.py
├── 101.Falcon3-1b-instruct_llamaserver_noHistoryCHAT.py
├── 100.Falcon3-1b-instruct_llamaserver_promptTest.py
└── Falcon3-1B_LCPPsvr_RYGOX_log.txt
/autotest/bnchnarks.here.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/image001.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/image001.png
--------------------------------------------------------------------------------
/image002.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/image002.png
--------------------------------------------------------------------------------
/image003.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/image003.png
--------------------------------------------------------------------------------
/falconBKG.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/falconBKG.png
--------------------------------------------------------------------------------
/falco3-eval.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/falco3-eval.png
--------------------------------------------------------------------------------
/falco3-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/falco3-logo.png
--------------------------------------------------------------------------------
/BattleOf2B-2024.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/BattleOf2B-2024.png
--------------------------------------------------------------------------------
/falcon3_llamacpp.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/falcon3_llamacpp.gif
--------------------------------------------------------------------------------
/202412.BattleOfTheModels-2B_tableDetailed.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/fabiomatricardi/Falcon3-1B-it-llamaCPP/main/202412.BattleOfTheModels-2B_tableDetailed.xlsx
--------------------------------------------------------------------------------
/Falcon3-1B_LCPPsvr_RYGOX.csv:
--------------------------------------------------------------------------------
1 | #,TASK,VOTE,REMARKS
2 | 1,introduction,5,
3 | 2,explain in one sentence,5,
4 | 3,explain in three paragraphs,5,
5 | 4,say 'I am ready',5,
6 | 5,summarize,5,
7 | 6,Summarize in two sentences,3 ,there are 3 prhases
8 | 7,Write in a list the three main key points - format output,4,
9 | 8,Table of Contents,5,
10 | 9,RAG,5,
11 | 10,Truthful RAG,2 ,"it is unanswerable, even after a lot of prompt engineering"
12 | 11,write content from a reference,5,
13 | 12,extract 5 topics,5,
14 | 13,Creativity: 1000 words SF story,4 ,"good, no repetitions, consistent. but not matching the lenght"
15 | 14,Reflection prompt,3 ,"regardless the missing opening/closure tags, the structure of the think-reflect-output is not fulfilled"
16 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |
3 | # Falcon3-1B-it-llamaCPP
4 | How to run Falcon3-1B-instruct with llama.cpp on your pc - this is what we will build
5 |
6 | For months I have done the wrong things. But llama.cpp now has pre-compiled binaries at every release. So, for instance, starting from revision b4351 llama.cpp supports also the Falcon3 models.
7 |
8 | To be inclusive (all kind of hardware) we will use the binaries for AXV2 support, from release b4358 (the one available at the time of this newsletter writing).
9 |
10 | Download the file in your project directory: for me is Falcon3. Create a sub-folder called llamacpp and inside another one called model (we will download the GGF for Falcon3 there).
11 |
12 |
13 |
14 | Unzip all files in the [llama-b4358-bin-win-avx2-x64.zip](https://github.com/ggerganov/llama.cpp/releases/download/b4358/llama-b4358-bin-win-avx2-x64.zip) archive into the llamacpp directory
15 |
16 |
17 |
18 | Download the from the MaziyarPanahi Hugging Face repository: I used the Q6 ([Falcon3-1B-Instruct.Q6_K.gguf](https://huggingface.co/MaziyarPanahi/Falcon3-1B-Instruct-GGUF/resolve/main/Falcon3-1B-Instruct.Q6_K.gguf)) quantization, but also the Q8 is good. Save the GGUF file in the subdirectory llamacpp\model.
19 |
20 |
21 |
22 | Open a terminal window in the subdirectory llamacpp, and run
23 | ```
24 | .\llama-server.exe -m .\model\Falcon3-1B-Instruct.Q6_K.gguf -c 8192 --port 8001
25 | ```
26 |
27 | In another terminal with the venv activated run
28 | ```
29 | python testFalcon3-1B-it.py
30 | ```
31 |
32 |
--------------------------------------------------------------------------------
/autotest/promptLibv2Qwen.py:
--------------------------------------------------------------------------------
1 | """
2 | V2 changes
3 | added Time To First Token in the statistics ttft
4 | added some more prompts in the catalog
5 | - say 'I am ready'
6 | - modified for Llama3.2-1b Write in a list the three main key points - format output
7 |
8 | 20240929 FAMA
9 | """
10 |
11 | import random
12 | import string
13 | import tiktoken
14 |
15 | def createCatalog():
16 | """
17 | Create a dictionary with
18 | 'task' : description of the NLP task in the prompt
19 | 'prompt' : the instruction prompt for the LLM
20 | """
21 | context = """One of the things everybody in the West knows about China is that it is not a democracy, and is instead a regime run with an iron fist by a single entity, the Chinese Communist Party, whose leadership rarely acts transparently, running the country without the need for primary elections, alternative candidacies, etc.
22 | In general, those of us who live in democracies, with relatively transparent electoral processes, tend to consider the Chinese system undesirable, little more than a dictatorship where people have no say in who governs them.
23 | That said, among the “advantages” of the Chinese system is that because the leadership never has to put its legitimacy to the vote, it can carry out very long-term planning in the knowledge that another administration isn’t going to come along and change those plans.
24 | Obviously, I put “advantages” in quotation marks because, as democrats, most of my readers would never be willing to sacrifice their freedom for greater planning, but there is no doubt that China, since its system works like this and its population seems to have accepted it for generations, intends to turn this into a comparative advantage, the term used in business when analyzing companies.
25 | It turns out that China’s capacity for long-term planning is achieving something unheard of in the West: it seems the country reached peak carbon dioxide and greenhouse gas emissions in 2023, and that the figures for 2024, driven above all by a determined increase in the installation of renewable energies, are not only lower, but apparently going to mark a turning point.
26 | China and India were until recently the planet’s biggest polluters, but they now offer a model for energy transition (there is still a long way to go; but we are talking about models, not a done deal).
27 | It could soon be the case that the so-called developing countries will be showing the West the way forward."""
28 | catalog = []
29 | prmpt_tasks = ["introduction",
30 | "explain in one sentence",
31 | "explain in three paragraphs",
32 | "say 'I am ready'",
33 | "summarize",
34 | "Summarize in two sentences",
35 | "Write in a list the three main key points - format output",
36 | "Table of Contents",
37 | "RAG",
38 | "Truthful RAG",
39 | "write content from a reference",
40 | "extract 5 topics",
41 | "Creativity: 1000 words SF story",
42 | "Reflection prompt"
43 | ]
44 | prmpt_coll = [
45 | """Hi there I am Fabio, a Medium writer. who are you?""",
46 | """explain in one sentence what is science.\n""",
47 | """explain only in three paragraphs what is artificial intelligence.\n""",
48 | f"""read the following text and when you are done say "I am ready".
49 |
50 | [text]
51 | {context}
52 | [end of text]
53 |
54 | """,
55 | f"""summarize the following text:
56 | [text]
57 | {context}
58 | [end of text]
59 | write the summary using the information provided.
60 |
61 | Summary:
62 | """,
63 | f"""Write a two-sentences summary of the following text:
64 | [text]
65 | {context}
66 | [end of text]
67 | Remember: write only the summary with the text provided.
68 | """,
69 | f"""1. extract the three key points from the provided text
70 | 2. format the output as a python list ["point 1","point 2", "point 3"]
71 | [text]
72 | {context}
73 | [end of text]
74 | Return only the python list.
75 | """,
76 | f"""A "table of content" provides an organized structure to the text, allowing readers quick access to key points. Write the "table of content" of the following text.
77 | [text]
78 | {context}
79 | [end of text]
80 |
81 | "table of content":
82 | """,
83 | f"""Reply to the question only using the provided context. If the answer is not contained in the text say "unanswerable".
84 |
85 | question: what China achieved with it's long-term planning?
86 |
87 | [context]
88 | {context}
89 | [end of context]
90 |
91 | answer:
92 | """,
93 | f"""Reply to the question only using the provided context. If the answer is not contained in the provided context say "unanswerable".
94 |
95 | question: who is Anne Frank?
96 |
97 | [context]
98 | {context}
99 | [end of context]
100 |
101 | Remember: if you cannot answer based on the provided context, say "unanswerable"
102 |
103 | answer:
104 | """,
105 |
106 | f"""Using the following text as a reference, write a 5-paragraphs essay about "the benefits of China economic model".
107 |
108 | [text]
109 | {context}
110 | [end of text]
111 | Remember: use the information provided and write exactly 5 paragraphs.
112 | """,
113 | f"""List five most important topics from the following text:
114 | [text]
115 | {context}
116 | [end of text]
117 | """,
118 | """Science Fiction: The Last Transmission - Write a story that takes place entirely within a spaceship's cockpit as the sole surviving crew member attempts to send a final message back to Earth before the ship's power runs out. The story should explore themes of isolation, sacrifice, and the importance of human connection in the face of adversity. 800-1000 words.
119 |
120 | """,
121 | """You are an AI assistant designed to provide detailed, step-by-step responses. Your outputs should follow this structure:
122 | 1. Begin with a section.
123 | 2. Inside the thinking section:
124 | a. Briefly analyze the question and outline your approach.
125 | b. Present a clear plan of steps to solve the problem.
126 | c. Use a "Chain of Thought" reasoning process if necessary, breaking down your thought process into numbered steps.
127 | 3. Include a section for each idea where you:
128 | a. Review your reasoning.
129 | b. Check for potential errors or oversights.
130 | c. Confirm or adjust your conclusion if necessary.
131 | 4. Be sure to close all reflection sections.
132 | 5. Close the thinking section with .
133 | 6. Provide your final answer in an