首页"GPT-4技术报告:多模态模型在人类水平表现出色"。
"GPT-4技术报告:多模态模型在人类水平表现出色"。
The development of GPT-4 marks a significant advance in artificial intelligence. GPT-4 is a large-scale, multimodal model that accepts image and text inputs and produces text outputs. While less capable than humans in many real-world scenarios, it demonstrates human-level performance on a variety of professional and academic benchmarks.
One notable result is that GPT-4 passes a simulated bar exam with a score in the top 10% of test takers, demonstrating its ability to understand and reason over complex legal text. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document, which is what enables it to generate coherent, contextually accurate text.
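To make that pretraining objective concrete, here is a minimal sketch of next-token prediction with a toy Transformer in PyTorch. Everything in it (model size, modules, random data) is an illustrative stand-in, not a description of GPT-4's actual architecture or training code:

```python
# A minimal, illustrative sketch of next-token prediction (not OpenAI's code).
# All names and sizes are toy stand-ins chosen for the example.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, D_MODEL, SEQ_LEN = 1000, 64, 16

class TinyLM(nn.Module):
    """A tiny decoder-style Transformer language model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, tokens):
        # Causal mask: position t may attend only to positions <= t.
        t = tokens.size(1)
        mask = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        h = self.blocks(self.embed(tokens), mask=mask)
        return self.head(h)  # logits over the vocabulary at each position

model = TinyLM()
tokens = torch.randint(0, VOCAB, (8, SEQ_LEN))  # a batch of token ids
logits = model(tokens[:, :-1])                  # predict from each prefix
# Train position t to predict token t+1: cross-entropy against the input
# shifted left by one -- the "predict the next token" objective.
loss = F.cross_entropy(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
loss.backward()
```

At generation time, sampling from the predicted distribution one token at a time and feeding each sampled token back in as input is what produces the extended text outputs described above.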
Overall, GPT-4 represents a significant advance in natural language processing. Its ability to understand and generate text across many contexts makes it a valuable tool for a wide range of applications, from professional and academic settings to creative writing and content generation, and it stands as a clear demonstration of the potential of artificial intelligence to augment human capabilities.
Authorship, Credit Attribution, and Acknowledgements
Please cite this work as “OpenAI (2023)”. All author lists in the sections below are sorted alphabetically.
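For readers managing references in LaTeX, a BibTeX entry along the following lines should work; this is a suggested form rather than an entry supplied by the report itself (arXiv:2303.08774 is the public preprint identifier of the GPT-4 Technical Report):

```bibtex
@article{openai2023gpt4,
  title   = {GPT-4 Technical Report},
  author  = {OpenAI},
  journal = {arXiv preprint arXiv:2303.08774},
  year    = {2023}
}
```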
Pretraining
Core contributors
Christopher Berner Supercomputing lead
Greg Brockman Infrastructure lead
Trevor Cai Throughput lead
David Farhi Manager of optimization team
Chris Hesse Infrastructure usability co-lead
Shantanu Jain Infrastructure usability co-lead
Kyle Kosic Uptime and stability lead
Jakub Pachocki Overall lead, optimization lead
Alex Paino Architecture & data vice lead
Mikhail Pavlov Software correctness lead
Michael Petrov Hardware correctness lead
Nick Ryder Architecture & data lead
Szymon Sidor Optimization vice lead
Nikolas Tezak Execution lead
Phil Tillet Triton lead
Amin Tootoonchian Model distribution, systems & networking lead
Qiming Yuan Dataset sourcing and processing lead
Wojciech Zaremba Manager of dataset team
Compute cluster scaling
Christopher Berner, Oleg Boiko, Andrew Cann, Ben Chess, Christian
Gibson, Mateusz Litwin, Emy Parparita, Henri Roussez, Eric Sigler,
Akila Welihinda
Data
Sandhini Agarwal, Suchir Balaji, Mo Bavarian, Che Chang, Sheila
Dunning, Leo Gao, Jonathan Gordon, Peter Hoeschele, Shawn Jain,
Shantanu Jain, Roger Jiang, Heewoo Jun, Łukasz Kaiser, Nitish
Shirish Keskar, Jong Wook Kim, Aris Konstantinidis, Chak Li, Todor
Markov, Bianca Martin, David Mély, Oleg Murk, Hyeonwoo Noh,
Long Ouyang, Alex Paino, Vitchyr Pong, Alec Radford, Nick Ryder,
John Schulman, Daniel Selsam, Chelsea Voss, Lilian Weng, Clemens
Winter, Tao Xu, Qiming Yuan, Wojciech Zaremba
Distributed training infrastructure
Greg Brockman, Trevor Cai, Chris Hesse, Shantanu Jain, Yongjik Kim,
Kyle Kosic, Mateusz Litwin, Jakub Pachocki, Mikhail Pavlov, Szymon
Sidor, Nikolas Tezak, Madeleine Thompson, Amin Tootoonchian,
Qiming Yuan
Hardware correctness
Greg Brockman, Shantanu Jain, Kyle Kosic, Michael Petrov, Nikolas
Tezak, Amin Tootoonchian, Chelsea Voss, Qiming Yuan
Optimization & architecture
Igor Babuschkin, Mo Bavarian, Adrien Ecoffet, David Farhi, Jesse
Han, Ingmar Kanitscheider, Daniel Levy, Jakub Pachocki, Alex Paino,
Mikhail Pavlov, Nick Ryder, Szymon Sidor, Jie Tang, Jerry Tworek,
Tao Xu
Training run babysitting
Suchir Balaji, Mo Bavarian, Greg Brockman, Trevor Cai, Chris Hesse,
Shantanu Jain, Roger Jiang, Yongjik Kim, Kyle Kosic, Mateusz Litwin,
Jakub Pachocki, Alex Paino, Mikhail Pavlov, Michael Petrov, Nick
Ryder, Szymon Sidor, Nikolas Tezak, Madeleine Thompson, Phil
Tillet, Amin Tootoonchian, Chelsea Voss, Ben Wang, Tao Xu, Qiming
Yuan
Long context
Core contributors
Gabriel Goh Long context co-lead
Łukasz Kaiser Long context lead
Clemens Winter Long context co-lead
Long context research
Mo Bavarian, Gabriel Goh, Łukasz Kaiser, Chak Li, Ben Wang,
Clemens Winter
Long context kernels
Phil Tillet
Vision
Core contributors
Trevor Cai Execution lead
Mark Chen Vision team co-lead, Deployment lead
Casey Chu Initial prototype lead
Chris Hesse Data load balancing & developer tooling lead
Shengli Hu Vision Safety Evaluations lead
Yongjik Kim GPU performance lead
Jamie Kiros Overall vision co-lead, deployment research & evals lead
Daniel Levy Overall vision co-lead, optimization lead
Christine McLeavey Vision team lead
David Mély Data lead
Hyeonwoo Noh Overall vision co-lead, research lead
Mikhail Pavlov Scaling engineering lead
Raul Puri Overall vision co-lead, engineering lead
Amin Tootoonchian Model distribution, systems & networking lead
Architecture research
Casey Chu, Jamie Kiros, Christine McLeavey, Hyeonwoo Noh, Raul
Puri, Alec Radford, Aditya Ramesh
Compute cluster scaling
Andrew Cann, Rory Carmichael, Christian Gibson, Henri Roussez,
Akila Welihinda
Distributed training infrastructure
Trevor Cai, Yunxing Dai, Chris Hesse, Brandon Houghton, Yongjik
Kim, Łukasz Kondraciuk, Hyeonwoo Noh, Mikhail Pavlov, Raul Puri,
Nikolas Tezak, Amin Tootoonchian, Tianhao Zheng
Hardware correctness
Oleg Boiko, Trevor Cai, Michael Petrov, Alethea Power
Data
Jong Wook Kim, David Mély, Reiichiro Nakano, Hyeonwoo Noh,
Long Ouyang, Raul Puri, Pranav Shyam, Tao Xu
Alignment data
Long Ouyang
Training run babysitting
Trevor Cai, Kyle Kosic, Daniel Levy, David Mély, Reiichiro Nakano,
Hyeonwoo Noh, Mikhail Pavlov, Raul Puri, Amin Tootoonchian
Deployment & post-training
Ilge Akkaya, Mark Chen, Jamie Kiros, Rachel Lim, Reiichiro Nakano,
Raul Puri, Jiayi Weng
Reinforcement Learning & Alignment
Core contributors
Greg Brockman Core infrastructure author
Liam Fedus Data flywheel lead
Tarun Gogineni Model creativity
Rapha Gontijo-Lopes Synthetic data
Joshua Gross Data collection engineering co-lead
Johannes Heidecke Refusals & model safety co-lead
Joost Huizinga Initial fine-tuning derisking
Teddy Lee Human Data Product Manager
Jan Leike Alignment co-lead
Ryan Lowe Alignment co-lead
Luke Metz Infrastructure lead, ChatML format lead
Long Ouyang IF data collection lead
John Schulman Overall lead
Jerry Tworek Code lead
Carroll Wainwright IF data infrastructure lead
Jonathan Ward Data collection engineering co-lead
Jiayi Weng RL Infrastructure author
Sarah Yoo Human Data Operations Manager
Wojciech Zaremba Human data lead
Chong Zhang Refusals & model safety co-lead
Shengjia Zhao Reward model lead
Barret Zoph Overall training lead
Dataset contributions
Diogo Almeida, Mo Bavarian, Juan Felipe Cerón Uribe, Tyna Eloundou, Liam Fedus, Tarun Gogineni, Rapha Gontijo-Lopes, Jonathan
Gordon, Joost Huizinga, Shawn Jain, Roger Jiang, Łukasz Kaiser,
Christina Kim, Jan Leike, Chak Li, Stephanie Lin, Ryan Lowe, Jacob
Menick, Luke Metz, Pamela Mishkin, Tong Mu, Oleg Murk, Ashvin
Nair, Long Ouyang, Alex Passos, Michael (Rai) Pokorny, Vitchyr
Pong, Shibani Santurkar, Daniel Selsam, Sarah Shoker, Carroll Wainwright, Matt Wiethoff, Jeff Wu, Kai Xiao, Kevin Yu, Marvin Zhang,
Chong Zhang, William Zhuk, Barret Zoph
Data infrastructure
Irwan Bello, Lenny Bogdonoff, Juan Felipe Cerón Uribe, Joshua
Gross, Shawn Jain, Haozhun Jin, Christina Kim, Aris Konstantinidis,
Teddy Lee, David Medina, Jacob Menick, Luke Metz, Ashvin Nair,
Long Ouyang, Michael (Rai) Pokorny, Vitchyr Pong, John Schulman,
Jonathan Ward, Jiayi Weng, Matt Wiethoff, Sarah Yoo, Kevin Yu,
Wojciech Zaremba, William Zhuk, Barret Zoph
ChatML format
Ilge Akkaya, Christina Kim, Chak Li, Rachel Lim, Jacob Menick,
Luke Metz, Andrey Mishchenko, Vitchyr Pong, John Schulman,
Carroll Wainwright, Barret Zoph
Model safety
Josh Achiam, Steven Adler, Juan Felipe Cerón Uribe, Hyung Won
Chung, Tyna Eloundou, Rapha Gontijo-Lopes, Shixiang Shane Gu,
Johannes Heidecke, Joost Huizinga, Teddy Lee, Jan Leike, Stephanie
Lin, Ryan Lowe, Todor Markov, Luke Metz, Tong Mu, Shibani
Santurkar, John Schulman, Andrea Vallone, Carroll Wainwright, Jason
Wei, Lilian Weng, Kai Xiao, Chong Zhang, Marvin Zhang, Barret Zoph
Refusals
Juan Felipe Cerón Uribe, Tyna Eloundou, Johannes Heidecke, Joost
Huizinga, Jan Leike, Stephanie Lin, Ryan Lowe, Pamela Mishkin,
Tong Mu, Carroll Wainwright, Lilian Weng, Kai Xiao, Chong Zhang,
Barret Zoph
Foundational RLHF and InstructGPT work
Diogo Almeida, Joost Huizinga, Roger Jiang, Jan Leike, Stephanie Lin,
Ryan Lowe, Pamela Mishkin, Dan Mossing, Long Ouyang, Katarina
Slama, Carroll Wainwright, Jeff Wu, Kai Xiao, Marvin Zhang
Flagship training runs
Greg Brockman, Liam Fedus, Johannes Heidecke, Joost Huizinga,
Roger Jiang, Kyle Kosic, Luke Metz, Ashvin Nair, Jiayi Weng, Chong
Zhang, Shengjia Zhao, Barret Zoph
Code capability
Ilge Akkaya, Mo Bavarian, Jonathan Gordon, Shawn Jain, Haozhun
Jin, Teddy Lee, Chak Li, Oleg Murk, Ashvin Nair, Vitchyr Pong,
Benjamin Sokolowsky, Jerry Tworek, Matt Wiethoff, Sarah Yoo, Kevin
Yu, Wojciech Zaremba, William Zhuk
Evaluation & analysis
Core contributors
Sandhini Agarwal System card co-lead
Lama Ahmad Expert red teaming & adversarial testing program lead
Mo Bavarian Capability prediction co-lead
Tyna Eloundou Safety evaluations co-lead
Andrew Kondrich OpenAI Evals open-sourcing co-lead
Gretchen Krueger System card co-lead
Michael Lampe Privacy and PII evaluations lead
Pamela Mishkin Economic impact & overreliance evaluations lead
Benjamin Sokolowsky Capability prediction co-lead
Jack Rae Research benchmark execution lead
Chelsea Voss Eval execution lead
Alvin Wang OpenAI Evals lead
Kai Xiao Safety evaluations co-lead
Marvin Zhang OpenAI Evals open-sourcing co-lead
OpenAI Evals library
Shixiang Shane Gu, Angela Jiang, Logan Kilpatrick, Andrew Kondrich, Pamela Mishkin, Jakub Pachocki, Ted Sanders, Jessica Shieh,
Alvin Wang, Marvin Zhang
Model-graded evaluation infrastructure
Liam Fedus, Rapha Gontijo-Lopes, Shixiang Shane Gu, Andrew
Kondrich, Michael (Rai) Pokorny, Wojciech Zaremba, Chong Zhang,
Marvin Zhang, Shengjia Zhao, Barret Zoph
Acceleration forecasting
Alan Hickey, Daniel Kokotajlo, Cullen O’Keefe, Sarah Shoker
ChatGPT evaluations
Juan Felipe Cerón Uribe, Hyung Won Chung, Rapha Gontijo-Lopes,
Liam Fedus, Luke Metz, Michael (Rai) Pokorny, Jason Wei, Shengjia
Zhao, Barret Zoph
Capability evaluations
Tyna Eloundou, Shengli Hu, Roger Jiang, Jamie Kiros, Teddy Lee,
Scott Mayer McKinney, Jakub Pachocki, Alex Paino, Giambattista
Parascandolo, Boris Power, Raul Puri, Jack Rae, Nick Ryder, Ted
Sanders, Szymon Sidor, Benjamin Sokolowsky, Chelsea Voss, Alvin
Wang, Rowan Zellers, Juntang Zhuang
Coding evaluations
Ilge Akkaya, Mo Bavarian, Jonathan Gordon, Shawn Jain, Chak Li,
Oleg Murk, Vitchyr Pong, Benjamin Sokolowsky, Jerry Tworek, Kevin
Yu, Wojciech Zaremba
Real-world use case evaluations
Andrew Kondrich, Joe Palermo, Boris Power, Ted Sanders
Contamination investigations
Adrien Ecoffet, Roger Jiang, Ingmar Kanitscheider, Scott Mayer
McKinney, Alex Paino, Giambattista Parascandolo, Jack Rae, Qiming
Yuan
Instruction following and API evals
Diogo Almeida, Carroll Wainwright, Marvin Zhang
Novel capability discovery
Filipe de Avila Belbute Peres, Kevin Button, Fotis Chantzis, Mike
Heaton, Wade Hickey, Xin Hu, Andrew Kondrich, Matt Knight, Andrew Mayne, Jake McNeil, Vinnie Monaco, Joe Palermo, Joel Parish,
Boris Power, Bob Rotsted, Ted Sanders
Vision evaluations
Shixiang Shane Gu, Shengli Hu, Jamie Kiros, Hyeonwoo Noh, Raul
Puri, Rowan Zellers
Economic impact evaluation
Tyna Eloundou, Sam Manning, Aalok Mehta, Pamela Mishkin
Non-proliferation, international humanitarian law & national security red teaming
Sarah Shoker
Overreliance analysis
Miles Brundage, Michael Lampe, Pamela Mishkin
Privacy and PII evaluations
Michael Lampe, Vinnie Monaco, Ashley Pantuliano
Safety and policy evaluations
Josh Achiam, Sandhini Agarwal, Lama Ahmad, Jeff Belgum, Tyna
Eloundou, Johannes Heidecke, Shengli Hu, Joost Huizinga, Jamie
Kiros, Gretchen Krueger, Michael Lampe, Stephanie Lin, Ryan Lowe,
Todor Markov, Vinnie Monaco, Tong Mu, Raul Puri, Girish Sastry,
Andrea Vallone, Carroll Wainwright, CJ Weinmann, Lilian Weng, Kai
Xiao, Chong Zhang
OpenAI adversarial testers
Josh Achiam, Steven Adler, Lama Ahmad, Shyamal Anadkat, Red
Avila, Gabriel Bernadett-Shapiro, Anna-Luisa Brakman, Tim Brooks,
Miles Brundage, Chelsea Carlson, Derek Chen, Hyung Won Chung,
Jeremiah Currier, David Dohan, Adrien Ecoffet,
Juston Forte, Vik Goel, Ryan Greene, Johannes Heidecke, Alan Hickey,
Shengli Hu, Joost Huizinga, Janko, Tomer Kaftan, Ali Kamali, Nitish
Shirish Keskar, Tabarak Khan, Hendrik Kirchner, Daniel Kokotajlo,
Gretchen Krueger, Michael Lampe, Teddy Lee, Molly Lin, Ryan
Lowe, Todor Markov, Jake McNeil, Pamela Mishkin, Vinnie Monaco,
Daniel Mossing, Tong Mu, Oleg Murk, Cullen O’Keefe, Joe Palermo,
Giambattista Parascandolo, Joel Parish, Boris Power, Alethea Power,
Cameron Raymond, Francis Real, Bob Rotsted, Mario Saltarelli, Sam Wolrich, Ted Sanders, Girish Sastry, Sarah Shoker,
Yang Song, Natalie Staudacher, Madeleine Thompson, Elizabeth
Tseng, Chelsea Voss, Jason Wei, Chong Zhang
System card & broader impacts analysis
Steven Adler, Sandhini Agarwal, Lama Ahmad, Janko Altenschmidt,
Jeff Belgum, Gabriel Bernadett-Shapiro, Miles Brundage, Derek Chen,
Tyna Eloundou, Liam Fedus, Leo Gao, Vik Goel, Johannes Heidecke,
Alan Hickey, Shengli Hu, Joost Huizinga, Daniel Kokotajlo, Gretchen
Krueger, Michael Lampe, Jade Leung, Stephanie Lin, Ryan Lowe,
Kim Malfacini, Todor Markov, Bianca Martin, Aalok Mehta, Pamela
Mishkin, Tong Mu, Richard Ngo, Cullen O’Keefe, Joel Parish, Rai
Pokorny, Bob Rotsted, Girish Sastry, Sarah Shoker, Andrea Vallone,
Carroll Wainwright, CJ Weinmann, Lilian Weng, Dave Willner, Kai
Xiao, Chong Zhang
Deployment
Core contributors
Steven Adler Early stage program management lead
Sandhini Agarwal Launch safety lead
Derek Chen Monitoring & response lead
Atty Eleti GPT-4 API co-lead
Joanne Jang GPT-4 product co-lead
Angela Jiang GPT-4 product co-lead
Tomer Kaftan Inference infrastructure & deployment lead
Rachel Lim GPT-4 API co-lead
Kim Malfacini Usage policy lead
Bianca Martin Release program management lead
Evan Morikawa Engineering lead
Henrique Ponde de Oliveira Pinto Inference workflow lead
Heather Schmidt GPT-4 infrastructure management
Maddie Simens Design lead
Felipe Such Inference optimization & reliability lead
Andrea Vallone Detection & refusals policy lead
Lilian Weng Applied research lead
Dave Willner Trust & safety lead
Michael Wu Inference research lead
Inference research
Paul Baltescu, Scott Gray, Yuchen He, Arvind Neelakantan, Michael
Wu
GPT-4 API & ChatML deployment
Greg Brockman, Brooke Chan, Chester Cho, Atty Eleti, Rachel Lim,
Andrew Peng, Michelle Pokrass, Sherwin Wu
GPT-4 web experience
Valerie Balcom, Lenny Bogdonoff, Jason Chen, Dave Cummings,
Noah Deutsch, Mike Heaton, Paul McMillan, Rajeev Nayak, Joel
Parish, Adam Perelman, Eric Sigler, Nick Turley, Arun Vijayvergiya,
Chelsea Voss
Inference infrastructure
Brooke Chan, Scott Gray, Chris Hallacy, Kenny Hsu, Tomer Kaftan,
Rachel Lim, Henrique Ponde de Oliveira Pinto, Raul Puri, Heather
Schmidt, Felipe Such
Reliability engineering
Haiming Bao, Madelaine Boyd, Ben Chess, Damien Deville, Yufei
Guo, Vishal Kuo, Ikai Lan, Michelle Pokrass, Carl Ross, David
Schnurr, Jordan Sitkin, Felipe Such
Trust & safety engineering
Jeff Belgum, Madelaine Boyd, Vik Goel
Trust & safety monitoring and response
Janko Altenschmidt, Anna-Luisa Brakman, Derek Chen, Florencia
Leoni Aleman, Molly Lin, Cameron Raymond, CJ Weinmann, Dave
Willner, Samuel Wolrich
Trust & safety policy
Rosie Campbell, Kim Malfacini, Andrea Vallone, Dave Willner
Deployment compute
Peter Hoeschele, Evan Morikawa
Product management
Jeff Harris, Joanne Jang, Angela Jiang
Additional contributions
Sam Altman, Katie Mayer, Bob McGrew, Mira Murati, Ilya Sutskever,
Peter Welinder
Blog post & paper content
Sandhini Agarwal, Greg Brockman, Miles Brundage, Adrien Ecoffet,
Tyna Eloundou, David Farhi, Johannes Heidecke, Shengli Hu, Joost
Huizinga, Roger Jiang, Gretchen Krueger, Jan Leike, Daniel Levy,
Stephanie Lin, Ryan Lowe, Tong Mu, Hyeonwoo Noh, Jakub Pachocki, Jack Rae, Kendra Rimbach, Shibani Santurkar, Szymon Sidor,
Benjamin Sokolowsky, Jie Tang, Chelsea Voss, Kai Xiao, Rowan
Zellers, Chong Zhang, Marvin Zhang
Communications
Ruby Chen, Cory Decareaux, Thomas Degry, Steve Dowling, Niko
Felix, Elie Georges, Anna Makanju, Andrew Mayne, Aalok Mehta,
Elizabeth Proehl, Kendra Rimbach, Natalie Summers, Justin Jay Wang,
Hannah Wong
Compute allocation support
Theresa Lopez, Elizabeth Tseng
Contracting, revenue, pricing, & finance support
Brooke Chan, Denny Jin, Billie Jonn, Patricia Lue, Kyla Sheppard,
Lauren Workman
Launch partners & product operations
Filipe de Avila Belbute Peres, Brittany Carey, Simón Posada Fishman,
Isabella Fulford, Teddy Lee, Yaniv Markovski, Tolly Powell, Toki
Sherbakov, Jessica Shieh, Natalie Staudacher, Preston Tuggle
Legal
Jake Berdine, Che Chang, Sheila Dunning, Ashley Pantuliano
Security & privacy engineering
Kevin Button, Fotis Chantzis, Wade Hickey, Xin Hu, Shino Jomoto,
Matt Knight, Jake McNeil, Vinnie Monaco, Joel Parish, Bob Rotsted
System administration & on-call support
Morgan Grafstein, Francis Real, Mario Saltarelli
We also acknowledge and thank every OpenAI team member not explicitly mentioned above,
including the amazing people on the executive assistant, finance, go-to-market, human resources,
legal, operations and recruiting teams. From hiring everyone in the company, to making sure we have
an amazing office space, to building the administrative, HR, legal, and financial structures that allow
us to do our best work, everyone at OpenAI has contributed to GPT-4.
We thank Microsoft for their partnership, especially Microsoft Azure for supporting model
training with infrastructure design and management, and the Microsoft Bing team and Microsoft’s
safety teams for their partnership on safe deployment.
We are grateful to our expert adversarial testers and red teamers who helped test our models at early stages of development and informed our risk assessments as well as the System Card output. Participation in this red teaming process is not an endorsement of the deployment plans
of OpenAI or OpenAI’s policies: Steven Basart, Sophie Duba, Cèsar Ferri, Heather Frase, Gavin
Hartnett, Jake J. Hecla, Dan Hendrycks, Jose Hernandez-Orallo, Alice Hunsberger, Rajiv W.
Jain, Boru Gollo Jattani, Lauren Kahn, Dan Kaszeta, Sara Kingsley, Noam Kolt, Nathan Labenz,
Eric Liddick, Andrew J. Lohn, Andrew MacPherson, Sam Manning, Mantas Mazeika, Anna
Mills, Yael Moros, Jimin Mun, Aviv Ovadya, Roya Pakzad, Yifan Peng, Ciel Qi, Alex Rosenblatt,
Paul Röttger, Maarten Sap, Wout Schellaert, George Shih, Muhammad Shoker, Melanie Subbiah,
Bryan West, Andrew D. White, Anna Katariina Wisakanto, Akhila Yerukola, Lexin Zhou, Xuhui Zhou
We thank our collaborators at Casetext and Stanford CodeX for conducting the simulated
bar exam: P. Arredondo (Casetext/Stanford CodeX), D. Katz (Stanford CodeX), M. Bommarito
(Stanford CodeX), S. Gao (Casetext).
GPT-4 was used for help with wording, formatting, and styling throughout this work.