Research

MLlib: Machine Learning in Apache Spark

Authors: Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J. Franklin, Reza Zadeh, Matei Zaharia, Ameet Talwalkar


Abstract

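Apache Spark is a popular open-source platform for large-scale data processing that is well suited to iterative machine learning tasks. In this paper we present MLlib, Spark's open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shipped with Spark, MLlib supports several languages and provides a high-level API that leverages Spark's rich ecosystem to simplify the development of end-to-end machine learning pipelines. MLlib has grown rapidly thanks to its vibrant open-source community of over 140 contributors, and it includes extensive documentation to support further growth and help users get up to speed quickly.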

Related content

Authors: Andrew Chen, Andy Chow, Aaron Davidson, Arjun DCunha, Ali Ghodsi, Sue Ann Hong, Andy Konwinski, Clemens Mewald, Siddharth Murching, Tomas Nykodym, Paul Ogilvie, Mani Parkhe, Avesh Singh, Fen Xie, Matei Zaharia, Richard Zang, Juntai Zheng, Corey Zumar, Databricks, Inc.

Authors: Matei Zaharia, Andrew Chen, Aaron Davidson, Ali Ghodsi, Sue Ann Hong, Andy Konwinski, Siddharth Murching, Tomas Nykodym, Paul Ogilvie, Mani Parkhe, Fen Xie, Corey Zumar, Databricks Inc.

Authors: Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, and Ion Stoica, UC Berkeley

Authors: Roy Fox, Richard Shin, Sanjay Krishnan, Ken Goldberg, Dawn Song, Ion Stoica

Authors: Firas Abuzaid, Joseph Bradley, Feynman Liang, Andrew Feng, Lee Yang, Matei Zaharia, Ameet Talwalkar

Authors: Cody Coleman, Deepak Narayanan, Daniel Kang, Tian Zhao, Jian Zhang, Luigi Nardi, Peter Bailis, Kunle Olukotun, Chris Ré, Matei Zaharia

Authors: Daniel Crankshaw, Xin Wang, Giulio Zhou, Michael J. Franklin, Joseph E. Gonzalez, Ion Stoica

Authors: Reza Bosagh Zadeh, Xiangrui Meng, Alexander Ulanov, Burak Yavuz, Li Pu, Shivaram Venkataraman, Evan Sparks, Aaron Staple, Matei Zaharia

Authors: Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica