Philosophical Research:The LLM Olympics
== Rules ==

Three very important rules will be followed in all of these tests:

<strong>1)</strong> Absolutely no online models will be used, only models that can be run entirely offline. This is mainly an ethical concern: running the models should not use more computing power or rack space than a regular computer program. It also has the benefit of creating the simplest test cases, with no external variables. If the whole model is 1 gigabyte or less, with no further 10 gigabytes of model hiding out of view, it is easier to know the full range of the model's behaviors. And if nobody else is running the model, "the company" cannot take any external actions while the test is running, such as datamining conversations or inserting ads. All the causes and effects inside the test will be in one place.

<strong>2)</strong> No generated sentences will be directly copied onto any page. All the text on these pages is created manually. The longest quotations of generated text on these pages will be approximately three words long.

<strong>3)</strong> The LLM must not be given an unreasonable task, only tasks that fit within the boundaries of its known programming, bugs, and quirks. Each task will include several steps of "testing understanding" to make sure the LLM is giving the intended answers at every single step, before it is given harder questions that require inference and are not directly explained in the text. Unless the task proves to be truly impossible, the test will not stop until the LLM actually completes the task.