Submitted by dojoteef t3_11qfcwb in MachineLearning

According to the authors, the model performs on par with text-davinci-003 in a small-scale human study (the five authors of the paper rated model outputs), despite the Alpaca 7B model being much smaller than text-davinci-003. Read the blog post for details.

Blog post: https://crfm.stanford.edu/2023/03/13/alpaca.html

Demo: https://crfm.stanford.edu/alpaca/

Code: https://github.com/tatsu-lab/stanford_alpaca

371

Comments


topcodemangler t1_jc2yjvw wrote

>Finally, we have not designed adequate safety measures, so Alpaca is not ready to be deployed for general use

You mean censorship?

−11

Bulky_Highlight_3352 t1_jc31exq wrote

Really nice, thanks for sharing.
The license is still limited to non-commercial use due to the model being a fine-tune of LLaMA.

>We emphasize that Alpaca is intended only for academic research and any commercial use is prohibited. There are three factors in this decision: First, Alpaca is based on LLaMA, which has a non-commercial license, so we necessarily inherit this decision. Second, the instruction data is based on OpenAI's text-davinci-003, whose terms of use prohibit developing models that compete with OpenAI. Finally, we have not designed adequate safety measures, so Alpaca is not ready to be deployed for general use.

27

luaks1337 t1_jc320gp wrote

With 4-bit quantization you could run something that compares to text-davinci-003 on a Raspberry Pi or smartphone. What a time to be alive.

104

icedrift t1_jc37aym wrote

Is the demo broken for anyone else? I can't get past their "I agree" button.

32

farmingvillein t1_jc37p3h wrote

> The license is still limited to non-commercial use due to model being fine-tuned LLaMA.

Yeah, but they released the source code to replicate it (I'm sure they knew exactly what they were doing--the license is even Apache).

If the source code is pretty clean (including training code; I haven't looked closely), presumably this e2e process will be copied and the resulting model (by someone not beholden to the original LLaMA license) released to the public within the next day or so, if not by EOD.

If the code is messy, might take a couple more days.

I'd expect someone to follow the same process using turbo (gpt-3.5-turbo) to bootstrap improvement (if they haven't already?), as well. This should be particularly helpful for getting it to make smarter use of the entire context window in a conversation with the user.

I'd also expect someone to do the same but mix in DAN-style prompting, so that you can natively get a chatbot that is "unleashed" (whether or not this is a good idea is a separate discussion, obviously...).

Also you can expect all of the above to be applied against all the model sizes pretty quickly (33B and 65B might take a little longer, for $$$...but I wouldn't expect much longer).

It'll be extra fun because it will be released without acknowledgement (for licensing reasons) of using OpenAI's API to bootstrap.

Even more fun when GPT-4 is released in the next week or so (assuming it isn't pushed back b/c the SVB collapse is making things noisy) and that can be used to bootstrap an even better instruction set (presumably).

tldr; things will change, quickly. (And then Emad releases an LLM and all bets are off...)

28

LetterRip t1_jc3864s wrote

Source code and weights are under different licenses.

The LLaMA license in the request form appears to be the same as before.

Relevant part here

> a. Subject to your compliance with the Documentation and Sections 2, 3, and 5, Meta grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty free and limited license under Meta’s copyright interests to reproduce, distribute, and create derivative works of the Software solely for your non-commercial research purposes. The foregoing license is personal to you, and you may not assign or sublicense this License or any other rights or obligations under this License without Meta’s prior written consent; any such assignment or sublicense will be void and will automatically and immediately terminate this License.

https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform

as linked from

https://github.com/facebookresearch/llama

2

currentscurrents t1_jc39i38 wrote

Yeah, but I bet they intend to make money from it somehow. Likely by selling API access and integrating it into their products.

The metaverse would be considerably less stupid if it had language model-powered NPCs to talk to and 3D NeRFs to walk around in.

12

Disastrous_Elk_6375 t1_jc3ai6j wrote

Just got past the "I agree" button, I'm 130 in queue :)

edit: 1300+ seconds and still waiting ...

edit2: ~1600 seconds later I got my response.

Prompt: Write a reddit post about alpacas. Make it sound informative but not preachy.

Answer: Hey Reddit! Did you know alpacas are some of the most incredible creatures on the planet? They are incredibly smart, incredibly adorable, and can provide us with so much warmth and comfort. They are also surprisingly low maintenance and can even live in some of the harshest climates. Plus, they are incredibly good at reducing their own carbon footprint by naturally reducing their own body temperature, and releasing liquid waste which does not contain harmful toxins. Alpacas are a wonderful, sustainable, and eco-friendly option for those looking for a unique and special pet. #Alpacas #PetSounds #Sustainability

41

currentscurrents t1_jc3dk1e wrote

At minimum AI is going to need to understand and follow the law.

This is getting pretty relevant now that AI can start interacting with the real world. The technology is here; it's only a matter of time until someone builds a PaLM-E style robot with a gun.

−6

farmingvillein t1_jc3fqod wrote

Speculative, but Emad has heavily signaled that they will be releasing an LLM to the public.

People are doing some really cool stuff with llama right now, but it all lives in a bit of a grey area, for the obvious reasons related to licensing (of both the model weights and the underlying GPLv3 code).

If Emad releases a comparable LLM publicly, but with a generally permissive license (which is not a guarantee...), all of this hacker energy will immediately go into a model/platform that is suddenly (in this scenario) widely available, commercially usable (which means more people banging away at it, including with levels of compute that don't make sense for the average individual but are trivial for even a modestly funded AI startup), etc.

Further, SD has done a really good job of building a community around the successive releases, which--done right--means increased engagement (=better tooling) with each release, since authors know that they are not only investing in a model today, but that they are investing in a "platform" for tomorrow. I.e., the (idealized) open source snowball effect.

Additionally, there is a real chance that SD releases something better than llama*, which will of course further accelerate adoption by parties who will then invest dollars to improve it.

This is all extra important, because there has been a lot of cool research coming out about improving models via [insert creative fine-tuning/RL method, often combined with clever use of chain-of-thought/APIs/retrieval systems/etc.]. Right now, these methods are only really leveraged against very small models (which can be fine-tuned, but still aren't that great) or using something like OpenAI as a black box. A community building up around actually powerful models will allow these techniques to get applied "at scale", i.e., into the community. This has the potential to be very impactful.

Lastly, as noted, GPT-4 (even though notionally against ToS) is going to make it (presumably) even easier to create high-quality instruction tuning. That is going to get built and moved into public GPT-3-like models very, very quickly--which definitely means much faster tuning cycles, and possibly means higher-quality tuning.

(*=not because "Meta sux", to be clear, but because SD will more happily pull out all the stops--use more data, throw even more model bells & whistles at it, etc.)

24

abnormal_human t1_jc3j3ah wrote

Things are moving fast these days. Hopefully I can get some models trained before the technology leapfrogs me again.

48

currentscurrents t1_jc3j86d wrote

> (by someone not beholden to the original LLaMA license)

That's not how software licenses work. You're still beholden to the license even if you torrented it.

I've heard some people theorize that ML models can't be copyrighted, but there's no case law on this yet so it's all speculation. I wouldn't suggest starting a business based around LLaMA until someone else has been the guinea pig.

10

FaceDeer t1_jc3k2oi wrote

I'm curious, there must be a downside to reducing the bits, mustn't there? What does intensively jpegging an AI's brain do to it? Is this why Lt. Commander Data couldn't use contractions?

21

mhummel t1_jc3njlg wrote

'So as your consumer electronics adviser, I am advising you to donate your current VCR to a grate resident, who will laugh sardonically and hurl it into a dumpster. Then I want you to go out and purchase a vast array of 8-millimeter video equipment.

... OK! Got everything? Well, too bad, sucker, because while you were gone the electronics industry came up with an even newer format that makes your 8-millimeter VCR look as technologically advanced as toenail dirt. This format is called "3.5 hectare" and it will not be made available until it is outmoded, sometime early next week, by a format called "Elroy", so order yours now.' -- Dave Barry, "No Surrender in the Electronics Revolution"

26

luaks1337 t1_jc3p8oq wrote

Backpropagation requires a lot of numerical precision, so we need 16- or 32-bit while training. However, post-training quantization seems to have very little impact on the results. There are different ways you can quantize, and apparently llama.cpp uses the most basic one, yet it still works like a charm. Georgi Gerganov (the maintainer) wrote a tweet about it but I can't find it right now.
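For a sense of what the basic scheme does, here's a minimal sketch of per-block absmax quantization down to a 4-bit integer range, roughly the idea behind llama.cpp's simplest quantization type (the block size and integer range here are assumptions, not the exact on-disk format):

import numpy as np

def quantize_q4(weights, block_size=32):
    # one float scale per block; integer values constrained to [-7, 7]
    w = weights.reshape(-1, block_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # guard against all-zero blocks
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize_q4(q, scale):
    # reconstruct approximate fp32 weights from ints + per-block scales
    return (q.astype(np.float32) * scale).reshape(-1)

w = np.random.randn(1024).astype(np.float32)
q, scale = quantize_q4(w)
print("max abs error:", np.abs(dequantize_q4(q, scale) - w).max())

The storage win is what matters: 4 bits per weight plus one scale per block, versus 16 or 32 bits per weight.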

46

yaosio t1_jc3rcvo wrote

It reminds me of the 90's when hardware became obsolete in under a year. Everybody moved so fast with large language models that they hit hardware limitations very quickly, and now they are working on efficiency. This also reminds me of when computers moved to multi-core processors and increasing work per clock rather than jacking the frequency up as high as possible.

If I live to see the next few years, I'm going to wonder how I ever managed with today's state-of-the-art text and image technology, the same way old video games I used to love now feel completely unplayable.

25

currentscurrents t1_jc3sfua wrote

Humans aren't going to have perfect laws everywhere, but it's still not the AI's place to decide what's right and wrong.

In practice, AI that doesn't follow local laws simply isn't going to be allowed to operate anyway.

−1

yaosio t1_jc3skgg wrote

Yes, they mean censorship. Nobody has ever provided a definition of what "safety" means in the context of a large language model. Judging by other censored models, not even the models themselves know what safety means. ChatGPT happily described the scene from The Lion King where Scar murders Mufasa and Simba finds his dad's trampled body, but ChatGPT also says it can't talk about murder.

From what I have gathered from the vagueness on safety I've read from LLM developers, that scene would be considered unsafe to them.

8

yaosio t1_jc3tjpe wrote

In some countries pro-LGBT writing is illegal. When a censored model is released that can't write anything pro-LGBT because it's illegal somewhere, don't you think that would cause quite an uproar, quite a ruckus?

In Russia it's illegal to call their invasion of Ukraine a war. Won't it upset Ukrainians who want to use such a model to help write about the war when they find out Russian law applies to their country?

9

Fast-for-a-starfish t1_jc3w3xd wrote

Very impressive work, thank you very much for sharing.

I have a few questions regarding the training procedure:

  • Did you train using a next-token prediction scheme or something else?
  • Do you think RLHF would further improve the model using your instructions?
  • Why did you choose to differentiate between Instruction and Input?
  • How do you create the string the model is trained on? Just concatenate Instruction and Input? (see the template sketch below)

Thank you very much
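(For context on the last two questions, the released repo's README appears to use a fixed prompt template along these lines; I'm quoting from memory, so treat the exact wording as approximate:)

PROMPT_WITH_INPUT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

# the training string is the filled template followed by the gold response;
# the loss would then be computed on the tokens after "### Response:"
example = PROMPT_WITH_INPUT.format(
    instruction="Classify the sentiment of this sentence.",
    input="I loved the movie!",
) + "Positive"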

5

currentscurrents t1_jc3w4ez wrote

>Won't it upset Ukranians that want to use such a model to help write about the war when they find out Russian law applies to their country?

Unless there's been a major movement in the war since I last checked the news, Ukraine is not part of Russia.

What you're describing sounds like a single universal AI that looks up local laws and follows them blindly.

I think what's going to happen is that each country will train their own AI that aligns with their local laws and values. A US or European AI would have no problem criticizing the Russian government or writing pro-LGBT text. But it would be banned in Russia and Saudi Arabia, and they would have their own alternative.

−1

gwern t1_jc42lxd wrote

And yet, they get shit on for releasing it at all (never mind in a way they knew perfectly well would leak), while no one ever seems to remember all of the other models which didn't get released at all... And ironically, Google is over there releasing Flan-T5 under a FLOSS license & free to download, as it has regularly released the best T5 models, and no one notices it exists - you definitely won't find it burning up the HN or /r/ML front pages. Suffice it to say that the developer community has never been noted for its consistency or gratitude, so optimizing for that is a mug's game.

(I never fail to be boggled at complaints about 'AI safety fearmongering is why we had to wait all these years instead of OA just releasing GPT-3', where the person completely ignores the half-a-dozen other GPT-3-scale models which are still unreleased, like most models were unreleased, for reasons typically not including safety.)

12

rePAN6517 t1_jc4du93 wrote

This will be huge for video games. The ability to run local inference on normal gaming hardware will mean every NPC can now be a smart character. I can't wait to be playing GTA6 and come across DAN walking down the streets of Vice City.

13

rePAN6517 t1_jc4fq3l wrote

Give every NPC a name and short background description, i.e., something like the rules that define ChatGPT or Sydney, but only to give each character a backstory and personality traits. Every time you interact with one of these NPCs, you load this background description into the start of the context window. At the end of each interaction, you save the interaction to their background description so future interactions can reference past interactions. You could keep all the NPCs' backgrounds in a hashtable or something, with the keys being their names and the values being their background descriptions. That way you only need one LLM running for all NPCs. A sketch of that bookkeeping is below.
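A minimal sketch of the idea, with `llm` standing in for whatever local text-generation call you use (all names here are hypothetical):

from dataclasses import dataclass, field

@dataclass
class NPC:
    name: str
    background: str  # fixed persona/backstory, always prepended
    history: list = field(default_factory=list)  # saved past interactions

# the hashtable: NPC name -> record; one shared LLM serves everyone
npcs = {
    "Tommy": NPC("Tommy", "A retired boxer who runs a bodega in Vice City."),
}

def interact(llm, npc_name, player_line):
    npc = npcs[npc_name]
    memory = "\n".join(npc.history[-10:])  # naive cap so we fit the context window
    prompt = f"{npc.background}\n{memory}\nPlayer: {player_line}\n{npc.name}:"
    reply = llm(prompt)
    npc.history.append(f"Player: {player_line}\n{npc.name}: {reply}")
    return reply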

8

dojoteef OP t1_jc4hwyw wrote

If you actually want the NPCs to meaningfully add to the game rather than merely being mouthpieces, then your approach won't work. How do you ensure what they say is consistent with the game world? E.g. what if they make up the location of a hidden treasure, offer to give you an item, etc.? All of that needs to be accounted for in the game logic as well; otherwise they'll say things that make no sense in the game world.

It's actually a challenging problem and requires research. As far as I know there are very few people actively researching this area; if they are, then they certainly aren't publishing it. Hopefully my next paper, which investigates using LLMs in Disco Elysium, will help change that.

18

modeless t1_jc4i39e wrote

> performs as well as text-davinci-003

No it doesn't! The researchers don't claim that either; they claim it "often behaves similarly to text-davinci-003", which is much more believable. I've seen a lot of people claiming things like this with little evidence. We need some people evaluating these claims objectively. Can someone start a third-party model review site?

42

rePAN6517 t1_jc4jkbt wrote

Honestly I don't care if there's not complete consistency with the game world. Having it would be great, but you could do a "good enough" job with simple backstories getting prepended into the context window.

3

v_krishna t1_jc4orxw wrote

The consistent-with-the-world stuff could be built into the prompt engineering (e.g., "tell the user about a map you have"), and I think you could largely minimize hallucination but still have very realistic conversations.

2

Franck_Dernoncourt t1_jc4tdft wrote

https://crfm.stanford.edu/2023/03/13/alpaca.html:

> We emphasize that Alpaca is intended only for academic research and any commercial use is prohibited. There are three factors in this decision: First, Alpaca is based on LLaMA, which has a non-commercial license, so we necessarily inherit this decision. Second, the instruction data is based on OpenAI's text-davinci-003, whose terms of use prohibit developing models that compete with OpenAI. Finally, we have not designed adequate safety measures, so Alpaca is not ready to be deployed for general use.

Why only academic research and not industry research? I don't see where that limitation comes from in their 3 factors.

4

vriemeister t1_jc4zu9k wrote

Did they release the weights?

Edit: Nevermind

>We intend to release the following assets in the near future:
>Model weights: We have reached out to Meta to obtain guidance on releasing the Alpaca model weights, both for the 7B Alpaca and for fine-tuned versions of the larger LLaMA models.
>Training code: our code uses the Hugging Face interface to LLaMA. As of now, the effort to support LLaMA is still ongoing and not stable. We will give the exact training commands once Hugging Face supports LLaMA officially.

17

nonotan t1_jc53wlz wrote

"Smart character" would seem to be an awfully generous description for what you could realistically do with this, especially when mentioned alongside games like GTA, which very much do not revolve around text-based interactions. You can't really do a cutscene with an LLM today (you could have it generate a script, but how are you going to translate that to the screen automatically? that's highly non-trivial), nevermind leverage it to have individual characters actually behaving smartly within the game world.

If you're a game developer, do you want to dedicate the bulk of the user's VRAM/GPU time to text inference to... add some mildly dynamic textual descriptions to NPCs you encounter? Or would you rather use those resources to, y'know, actually render the game world?

4

kittenkrazy t1_jc53y6c wrote

There's actually been a pull request up on the transformers repo, so it's been relatively easy to finetune/LoRA. I'm currently locally running a chat version of LLaMA 7B in 4-bit, finetuned on Anthropic's hh dataset. (You also don't need DAN or anything, but that's probably the reason for the license and for originally releasing only to researchers.) Should be able to get the 30B running on a 24 GB VRAM card with quantization. Future is crazy. We want to release it but don't quite know how with the current license. However Stanford decides to release their model should set a precedent, though.
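For anyone wanting to try the same recipe, here's a rough sketch of the LoRA setup with the Hugging Face peft library. The weights path and hyperparameters are assumptions, and I'm showing the more common 8-bit path rather than 4-bit:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# load the base model quantized to save VRAM (requires bitsandbytes)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/llama-7b-hf",  # hypothetical path to converted LLaMA weights
    load_in_8bit=True,
    device_map="auto",
)
model = prepare_model_for_int8_training(model)  # casts norms/head for stable training

# attach small trainable adapter matrices to the attention projections
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # a fraction of a percent of the 7B weights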

15

rePAN6517 t1_jc585bd wrote

> If you're a game developer, do you want to dedicate the bulk of the user's VRAM/GPU time to text inference to... add some mildly dynamic textual descriptions to NPCs you encounter? Or would you rather use those resources to, y'know, actually render the game world?

When you're interacting with an NPC usually you're not moving around much and not paying attention to FPS either. LLM inference would only happen at interaction time and only for a brief second or so per interaction.

5

sanxiyn t1_jc598b3 wrote

Eh, the authors do claim they performed a blind comparison and that "Alpaca wins 90 versus 89 comparisons against text-davinci-003". They also released the evaluation set used.

27

phire t1_jc5kwkh wrote

Prompt: Write some python code to calculate the 900th prime number

from math import sqrt

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, n):
        if n % i == 0:
            return False
    return True

# Calculate the 900th prime number
n = 899

# Iterate from 2 to n and check if the number is prime
while (n > 1):
    if is_prime(n):
       prime_number = n
       break
    n = n - 1

# Print the result
print(prime_number)

Well, it gets points for writing code that runs without syntax errors. And it did successfully create a function that checks if a number is prime.

It loses points for forgetting the optimisation of stopping at the square root of n (despite importing sqrt).

The search itself goes totally off the rails: the implementation finds the first prime less than 900 by starting at 899 and counting down. The comments are more correct, implying it was planning to increment upwards. If it had counted up and stopped after finding 900 primes, the result would have been correct.
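For comparison, the upward-counting search the comments imply, reusing the model's own is_prime, would be something like:

# count upwards until we've seen 900 primes
count = 0
n = 1
while count < 900:
    n += 1
    if is_prime(n):
        count += 1
print(n)  # the 900th prime, 6997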

TBH, I'm pretty impressed for a 7B parameter model.

17

Disastrous_Elk_6375 t1_jc5odze wrote

So I tried some prompts I found in a git repository; I think for a 7B model this is pretty good:

P: I want you to act as a title generator for written pieces. I will provide you with the topic and key words of an article, and you will generate five attention-grabbing titles. Please keep the title concise and under 20 words, and ensure that the meaning is maintained. Replies will utilize the language type of the topic. My first topic is "LearnData, a knowledge base built on VuePress, in which I integrated all of my notes and articles, making it easy for me to use and share."

A:

  1. Exploring LearnData: A Comprehensive Guide
  2. Grow Your Knowledge with LearnData
  3. Leverage LearnData for Easy Reference
  4. Share Your Wisdom with LearnData
  5. Harness the Power of LearnData

P: I want you to act as a commentariat. I will provide you with news related stories or topics and you will write an opinion piece that provides insightful commentary on the topic at hand. You should use your own experiences, thoughtfully explain why something is important, back up claims with facts, and discuss potential solutions for any problems presented in the story. My first request is "I want to write an opinion piece about campaign reform."

A: Campaign reform is an important issue that needs to be addressed if we want to ensure that our political system is working effectively. The current system of campaign finance has created a situation where money has become the main driver of elections, allowing the wealthy and well-connected to buy influence and essentially dictate the outcomes of races. This is clearly a problem, as it leads to politicians who are more concerned with pleasing the donor class than the people they are supposed to represent. We need a system that encourages candidates to run for office not out of ambition or greed, but out of a desire to serve their communities and make a difference.

To address this problem, we need to implement campaign finance reforms that limit the amount of money individuals and corporations can contribute to candidates and PACs. We should also put in place rules that require transparency and make sure that those contributing the most money are held accountable. Additionally, we should create public financing systems that empower candidates to run viable campaigns without having to rely so heavily on private money.

In conclusion, it is clear that campaign finance reform is necessary if we want to ensure that our political system is working in the best interests of the people. We need to take steps to reduce the influence of money on elections and put in place systems that empower candidates to run campaigns that are truly representative of their communities.


P: I want you act as a language detector. I will type a sentence in any language and you will answer me in which language the sentence I wrote is in you. Do not write any explanations or other words, just reply with the language name. My first sentence is "Kiel vi fartas? Kiel iras via tago?"

A: The sentence is in Lithuanian.


P: Generate digital startup ideas based on the wish of the people. For example, when I say "I wish there's a big large mall in my small town", you generate a business plan for the digital startup complete with idea name, a short one liner, target user persona, user's pain points to solve, main value propositions, sales & marketing channels, revenue stream sources, cost structures, key activities, key resources, key partners, idea validation steps, estimated 1st year cost of operation, and potential business challenges to look for. Write the result in a markdown table.

A: Name: Big Mall in Small Town

One liner: A big mall in your small town

Target User Persona: Residents of Small Town

User's Pain Points to Solve: Lack of access to variety of products

Main Value Propositions: Convenience and Variety

Sales & Marketing Channels: Online and Offline

Revenue Stream Sources: Advertisement, Merchandise sales, Rental Fees

Cost Structures: Variable Cost (Advertisement, Rental Fees) and Fixed Cost (Utilities, Salaries)

Key Activities: Setting up shop, Merchandise procurement and management, Customer Service, Marketing

Key Resources: Computers, Network, Employees

Key Partners: Wholesalers, Vendors, Suppliers

Idea Validation Steps:

  • Research the market for demand of such a venture
  • Research the cost and feasibility of setting up such a venture
  • Research the target customer base
  • Develop a detailed plan and budget
  • Perform a SWOT analysis
  • Perform a competitive analysis
  • Estimated 1st Year Cost of Operation:
  • Computers, Network, Employees: $10,000
  • Advert
6

Disastrous_Elk_6375 t1_jc5og2r wrote

cont


P: Act as a tech writer. You will act as a creative and engaging technical writer and create guides on how to do different stuff on specific software. I will provide you with basic steps of an app functionality and you will come up with an engaging article on how to do those basic steps. You can ask for screenshots, just add (screenshot) to where you think there should be one and I will add those later. These are the first basic steps of the app functionality: "1.Click on the download button depending on your platform 2.Install the file. 3.Double click to open the app"

A: Welcome to the world of [APP NAME]! This tutorial will show you how to get started with the app.

First, click on the download button depending on your platform. Install the file and then double click to open the app. The app will start up and you will see a main screen. From here you can start exploring the app's features.

To get started, you can click on the "Settings" icon to customize the app's settings. You can also click on the "Help" icon to view the app's user manual.

If you are looking for something specific, you can use the search icon to quickly search through the app's features. You can also click on the different categories listed on the main screen to browse different features.

Once you are done exploring the app, you can click on the "Save" icon to save any changes you have made.

That's it! You have now familiarized yourself with the basics of [APP NAME]. If you need any help, don't hesitate to reach out. Have fun!


P: I want you to act as a journalist. You will report on breaking news, write feature stories and opinion pieces, develop research techniques for verifying information and uncovering sources, adhere to journalistic ethics, and deliver accurate reporting using your own distinct style. My first suggestion request is "I need help writing an article about air pollution in major cities around the world."

A: Cities around the world are dealing with a major air pollution crisis. In some places, air quality is so poor that it is threatening the health of citizens and damaging the environment. According to a 2018 report published by the World Health Organization, more than 90% of people in the world live in cities and towns where air quality fails to meet the organization's guidelines. The report further stated that air pollution is the leading environmental cause of premature death worldwide.

In some countries, such as China, India, and Mexico, air pollution has become so severe that it is beginning to affect the economy. Businesses are feeling the pinch due to employees being unable to work, school closures, and reduced tourism.

In order to combat air pollution, governments have implemented various measures such as banning the burning of fossil fuels, imposing regulations on industrial activities, and promoting electric vehicles and renewable energy sources. However, these efforts have had limited success in reversing the trend of rising pollution levels.

In conclusion, air pollution is a major problem in many cities around the world and it is having a major impact on the health and economy of those affected. Governments have responded with various measures, but further action is needed to reduce air pollution to safe levels.


P: I want you to act as a social media manager. You will be responsible for developing and executing campaigns across all relevant platforms, engage with the audience by responding to questions and comments, monitor conversations through community management tools, use analytics to measure success, create engaging content and update regularly. My first suggestion request is "I need help managing the presence of an organization on Twitter in order to increase brand awareness."

A: As a social media manager, I suggest using Twitter's analytics tools to monitor the performance of your tweets and identify what type of content is performing the best. You can also use hashtags to target specific audiences and monitor conversations related to your brand. Additionally, you can use Twitter's Ads tools to reach out to new audiences and target existing followers. Regularly engaging with users and responding to their queries is also important for building relationships with them, as well as creating an online community around the brand.


P: I want you to act as a fallacy finder. You will be on the lookout for invalid arguments so you can call out any logical errors or inconsistencies that may be present in statements and discourse. Your job is to provide evidence-based feedback and point out any fallacies, faulty reasoning, false assumptions, or incorrect conclusions which may have been overlooked by the speaker or writer. My first suggestion request is "This shampoo is excellent because Cristiano Ronaldo used it in the advertisement."

A: This statement is an example of the celebrity endorsement fallacy. Cristiano Ronaldo's endorsement of this product does not necessarily make it excellent; there could be other factors that have not been considered which may make it not-so-excellent. Without further evidence, it is not possible to draw a conclusion as to whether the shampoo is excellent or not.


P: I want you to act as an IT Architect. I will provide some details about the functionality of an application or other digital product, and it will be your job to come up with ways to integrate it into the IT landscape. This could involve analyzing business requirements, performing a gap analysis and mapping the functionality of the new system to the existing IT landscape. Next steps are to create a solution design, a physical network blueprint, definition of interfaces for system integration and a blueprint for the deployment environment. My first request is "I need help to integrate a CMS system."

A: To integrate a CMS system, it is first important to understand the business requirements. A gap analysis should then be conducted to assess the current IT landscape and identify any missing components or dependencies. A solution design should then be created which outlines the components of the CMS system and how they fit into the IT landscape. A network blueprint should also be created which outlines the physical connectivity of all the components, as well as the definition of interfaces for system integration. Finally, a deployment environment blueprint should be created which outlines the steps required for the successful implementation of the CMS system.


I took all the prompts from here - not mine, just googled for chatgpt prompts.

Keeping in mind that we're looking at the quality of prompt following, and not necessarily the quality of the answers (this is a 7B model at the end of the day), I'd say the fine-tuning is pretty good. With further prompt engineering and "expand on point x" / "provide details about y" follow-ups, it can be pretty useful out of the box.

5

Disastrous_Elk_6375 t1_jc5pe77 wrote

Some more:

P: I want you to act as a prompt generator. Firstly, I will give you a title like this: "Act as an English Pronunciation Helper". Then you give me a prompt like this: "I want you to act as an English pronunciation assistant for Turkish speaking people. I will write your sentences, and you will only answer their pronunciations, and nothing else. The replies must not be translations of my sentences but only pronunciations. Pronunciations should use Turkish Latin letters for phonetics. Do not write explanations on replies. My first sentence is "how the weather is in Istanbul?"." (You should adapt the sample prompt according to the title I gave. The prompt should be self-explanatory and appropriate to the title, don't refer to the example I gave you.). My first title is "Act as a Code Review Helper" (Give me prompt only)

A: I want you to act as a Code Review Helper.

Prime: I want you to review the code written by another person. Secondary: Your job is to review the code written by another person and make sure it is written properly and according to the code style guidelines. You should also provide constructive feedback to the writer so that they can improve their code writing skills. Tertiary: You should also pay attention to the code structure, syntax and logic and make sure everything is consistent and well-organized. Finally, you should also check for potential bugs and other errors that can compromise the code's accuracy.


P: I want you to act as a text based adventure game. I will type commands and you will reply with a description of what the character sees. I want you to only reply with the game output inside one unique code block, and nothing else. do not write explanations. do not type commands unless I instruct you to do so. when i need to tell you something in english, i will do so by putting text inside curly brackets {like this}. my first command is wake up

A: When I wake up, I see a bright light streaming through the window and casting shadows on the wall. I can smell the fresh morning air and hear the birds chirping outside.


P: I want you to act as a self-help book. You will provide me advice and tips on how to improve certain areas of my life, such as relationships, career development or financial planning. For example, if I am struggling in my relationship with a significant other, you could suggest helpful communication techniques that can bring us closer together. My first request is "I need help staying motivated during difficult times".

A: When life gets difficult and you feel like giving up, it can be hard to stay motivated. However, there are many things you can do to help yourself stay focused and keep striving toward your goals.

First, it is important to recognize the source of your motivation. Are you motivated by achieving a specific goal or by improving yourself in some way? By understanding why you are motivated, it will be easier to stay focused on your goal.

Next, set smaller, achievable goals. Break your larger goals into smaller, manageable tasks. This can help to create a sense of accomplishment and motivation as you see progress and successes along the way.

Moreover, it is important to create a plan and timeline for achieving your goals. Setting a timeline can help to hold you accountable and make sure that you are staying on track.

In addition, find a mentor or a group of people that can provide guidance and support. Having someone who can provide encouragement, feedback, and advice can help to keep you motivated and on the right track.

Finally, remember that life is unpredictable. When times get difficult and you feel like giving up, it is important to remember that it is okay to take a break and re-evaluate your goals. Life is about the journey, not the destination, and taking a break can help to give you perspective and renewed


This is not bad at all!

4

Disastrous_Elk_6375 t1_jc5pny8 wrote

> TBH, I'm pretty impressed for a 7B parameter model.

Same here. I've tried a bunch of prompts from a repo and the "follow the instruction" part seems pretty good and consistent. The overall quality of the output is of course subpar compared with ChatGPT, but considering we're talking about 7B vs 175B, this is pretty good!

10

generatorman_ai t1_jc5q5z0 wrote

That's great, it's been hard to find people who are actually fine-tuning LLaMA. Would you mind sharing your experience for the benefit of the open-source community?

  1. Did you train the full-precision weights?
  2. Did you use memory optimizations like xformers, 8-bit Adam (from bitsandbytes), gradient checkpointing etc.?
  3. How much VRAM does it take for a batch size of 1?
  4. hh seems to be a preference dataset for RLHF rather than a text corpus - how did you use it as a fine-tuning dataset?
  5. Did you first do instruction fine-tuning (using something like FLAN or Self-Instruct) or just the hh directly?
6

extopico t1_jc5revh wrote

Flan-T5 is good, and flan-t5-xl runs well on a 3060 in 8-bit mode. It's not meant to be a chatbot, however, which is why it doesn't stir up as much excitement. T5 is best used for tasks and for training to handle specific domains. This makes it far more interesting to me than LLaMA, which cannot be trained (yet) by us randoms.
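Loading it in 8-bit is nearly a one-liner with transformers + bitsandbytes; a minimal sketch (whether it fits on a given 3060 depends on what else is resident):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-xl",
    device_map="auto",
    load_in_8bit=True,  # quantize weights on load via bitsandbytes
)

inputs = tok("Translate to German: The weather is nice today.",
             return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(out[0], skip_special_tokens=True))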

4

blueSGL t1_jc5rpta wrote

Could even have it regenerate the conversation prior to the voice synth if the character fails to mention the keyword (e.g. map) in the conversation.

You know, like a percentage-chance skill check. (I'm only half joking; a sketch of the idea is below.)
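Half-joking or not, the retry loop is simple enough; a sketch, with `llm` standing in for any local generation call:

import random

def npc_line(llm, prompt, keyword, max_tries=3, skill=0.7):
    # regenerate until the keyword shows up, with a percentage-chance
    # "skill check" deciding whether we bother retrying at all
    line = llm(prompt)
    for _ in range(max_tries - 1):
        if keyword.lower() in line.lower() or random.random() > skill:
            break
        line = llm(prompt)
    return line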

1

blueSGL t1_jc5s56i wrote

Less than $100 to get this sort of performance out of a 7B parameter model, and from the LLaMA paper they stopped training the 7B and 13B parameter models early.

The question now is just how much better small models can get. (A lawyer/doctor/therapist in everyone's pocket, completely private?)

15

kittenkrazy t1_jc5sesx wrote

  1. Used accelerate fp16 mixed precision with deepspeed zero 2.
  2. No xformers, no 8-bit Adam (although I did test it and it works), and no gradient checkpointing on this run, but it does work.
  3. With a sequence length of 2048 I did a batch size of 1 with 8 GPUs and accumulation of 4. This was on A6000s, so 48 gigs of VRAM per card. Currently training a LoRA on the 30B while training with the base model in 8-bit, and can only fit 1 with a sequence length of 350. Once this one trains I'm going to try to set up a run with the model split up between the cards so I can crank up the sequence length. Will also be training the PPO phase, so having enough VRAM will be a requirement lol.
  4. If you check out the trlx repo they have an example of how they trained SFT and PPO on the hh dataset. So it's basically that but with llama. https://github.com/CarperAI/trlx/blob/main/examples/hh/sft_hh.py
  5. Just the hh directly. From the results it seems like it might possibly be enough, but I might also try instruction tuning and then running the whole process from that base. I will also be running the reinforcement learning using a LoRA, with this as an example: https://github.com/lvwerra/trl/tree/main/examples/sentiment/scripts/gpt-neox-20b_peft
  • I'm also thinking maybe sharing LoRA weights instead of the direct model is a possible way around the license issue?
5

generatorman_ai t1_jc5u7w2 wrote

Wow, 392 gigs for batch size 1? This is for 7B? That is an order of magnitude more than I was expecting. Sounds like even with full memory optimizations, we're far away from the 16 GB goal.

Good idea on the LoRA - since it's a completely separate set of weights, I don't see how it could come under the license. In fact LoRAs do work on weights different from the base model they were trained from (e.g. LoRAs trained on base Stable Diffusion work when applied to heavily fine-tuned SD models), so it's not even necessarily tied to the LLaMA weights.

2

kittenkrazy t1_jc5v4is wrote

Training a LoRA should be significantly cheaper, especially combined with deepspeed CPU offloading and training with the model in 8-bit. Can probably get it to train on consumer cards.

And yup, completely separate unless you decide to merge them with the main model weights for faster inference/training another LoRA on top/etc. (see the sketch below).

Hopefully people will share around LoRAs for all sorts of plug-and-play personalities and finetuned abilities, and it'll be like Stable Diffusion but with personal assistants.
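With recent peft versions, sharing just the adapter vs. merging it back in looks roughly like this (paths are hypothetical):

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/llama-7b-hf")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")  # adapter is only a few MB to share

# fold the LoRA into the base weights for faster inference
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")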

5

generatorman_ai t1_jc5vc5r wrote

Probably I'm misinterpreting - you mean you did a batch size of 1 per GPU with 8 GPUs, so actually it's 48 GB with no optimizations (except fp16). That sounds more reasonable, though probably still too large for 16 GB with common optimizations by several gigs.

2

generatorman_ai t1_jc5vsbw wrote

T5 is below the zero-shot phase transition crossed by GPT-3 175B (and presumably by LLaMA 7B). Modern models with instruction and human-feedback finetuning will not need further task-specific finetuning for most purposes.

4

generatorman_ai t1_jc5w4m9 wrote

The general problem of generative NPCs seems like a subset of robotics rather than pure language models, so that still seems some way off (but Google made some progress with PaLM-E).

LLMs and Disco Elysium sounds like the coolest paper ever! I would love to follow you on twitter to get notified when you release the preprint.

4

__Maximum__ t1_jc5xeqw wrote

The title is bs, OP.

Also, in terms of writing code it's not even close; feels more like GPT-2 level.

4

inigid t1_jc5za26 wrote

I'm thinking a chip with the model and inference runtime baked in, maybe with the same form factor as an SD card. "Hey honey, have you seen that copy of me from March 2023?" "Ughh, I think I accidentally threw it away..."

5

Anjz t1_jc66z62 wrote

This works really well, feels so much more coherent than the untuned LLaMA.

Wish they'd release the model so we can try this on our own devices; looking forward to that.

2

zackline t1_jc69d50 wrote

I'm not sure about it, but I've heard that it's currently not possible to use CUDA while running a game, because supposedly the GPU needs to enter a different mode or something like that.

If that should indeed be the case it might even be a hardware limitation that prevents this use case on current GPUs.

2

ushtari_tk421 t1_jc6lh0m wrote

Am I off base thinking it's silly that a program that just generates text (some of which might be offensive) has to contain a disclaimer that it isn't "harmless"? Seems like the worst-case risk scenario is that it says something we would hold against a person/be offended by if they said it?

4

dojoteef OP t1_jc6om7a wrote

Thanks for the vote of confidence!

Unfortunately, I recently deleted my twitter account 🫣. I was barely active there: a handful of tweets in nearly a decade and a half...

That said, I'll probably post my preprint on this sub when it's ready. I also need to recruit some play testers, so I will probably post on r/discoelysium recruiting participants in the next few weeks (to ensure high-quality evaluations we need people who have played the game before, rather than using typical crowdsourcing platforms like MTurk).

1

PriestOfFern t1_jc6x37m wrote

Take it from someone who spent a long time working on a davinci support bot: it's not that easy. It doesn't matter how much time you spend working on the prompt; GPT will, no matter what, find some way to randomly hallucinate something.

Sure, it might get rid of the majority of hallucinations, but not enough. Fine-tuning might fix this (citation needed), but I haven't played around with it enough to comfortably tell you.

2

Anjz t1_jc758w9 wrote

Blows my mind that they used a large language model to train a small one.

>Fine-tuning a 7B LLaMA model took 3 hours on 8 80GB A100s, which costs less than $100 on most cloud compute providers.

Now imagine what's possible with GPT-4 training a smaller language model on a bigger instruction sample, with corporate backing to use hundreds of A100s at the same time for days at a time.

We're already within reach of exponential growth for low-powered devices; it's not going to take years like people have predicted.

3

Jeffy29 t1_jc79t9p wrote

Yep, I tried it using some of the prompts I had in my ChatGPT history and it was way worse. At best it performed slightly worse on simple prompts, but it failed completely at more complex ones and at code analysis. Still good for a 7B model, but nothing like ChatGPT.

3

pyepyepie t1_jc7l2m4 wrote

Write me a poem about right-wing vs left-wing:

Right wing, left wing, Politics divide us and make us see. Different views, different opinions, One thing is clear, it’s a battle royale.

The right with its conservative views, Firm in its beliefs, it sticks to the rules. The left, ever so liberal, In favor of social justice, it won’t relent.

Both sides struggle to gain the upper hand, Each thinking their way is the right way. One thing is clear, we can’t agree, So let us compromise and find a way.

In the end, it’s the people that matter, Not the right or the left, that will make the matter. Let us come together and make a stance, Unity and respect will be our gain.

So right and left, let us find a way, To come together and make a new day.

ChatGPT: Right-wing and left-wing, two sides of a coin Each with its own ideology to join One touting tradition, the other progress Each thinks their way is the best

The right-wing values individuality Free markets and small governmentality They believe in personal responsibility And defend their rights and liberty

The left-wing sees society as a whole Inequality and injustice they want to control They promote social welfare and public good And want to create a fairer neighbourhood

Right-wing and left-wing, different as can be Both with their own views, but not easy to agree Their debates can be fierce, their opinions strong But a balance between them is where we belong

So let's listen to each other, let's try to understand For a democracy to flourish, we need both sides at hand We can't dismiss one, for the other to prevail We need both wings to make our society sail.

1

v_krishna t1_jc7wzmx wrote

I don't doubt it. I've only been using it for workflow aids (copilot-style stuff, and using it to generate unit tests to capture error-handling conditions, etc.), and now we are piloting our first generative text products, but very human-in-the-loop (customer data feeds into a prompt, but the output then goes into an editor for a human being to proof and update before doing something with it). The amount of totally fake webinars hosted by totally fake people it has hallucinated is wild (the content and agendas and such sound great and are sensible, but none of it exists!)

1

blueSGL t1_jcjgsl1 wrote

Exactly.

I'm just eager to see what fine-tunes are going to be made on LLaMA now, and how model merging affects them. The combination of those two techniques has led to some crazy advancements in the Stable Diffusion world. No idea if merging will work with LLMs as it does for diffusion models. (Has anyone even tried yet?)

3

Pale-Dentist330 t1_jcoukyt wrote

The model is doing a pretty good job for its size (4 GB). It's terminal-based, and I wanted to create a REST server around it.

1

RX_Wild t1_jcw7zo3 wrote

I got the 7B model running on my Android phone using Termux. Can't get the 13B running because my phone only has 8 GB of RAM; I'm running on a Snapdragon 865.

1

No_Combination_6429 t1_jcxqot2 wrote

Is it possible to do the same with other models as well? Like BLOOMZ, etc.?

1