"高通量基因组学:Apache Spark让数据处理触手可及"

The presentation "High-throughput Genomics at Your Fingertips with Apache Spark", given at Spark Summit EU 2016 in Brussels, describes how KeyGene, a company specializing in genomics, came to use Apache Spark for analyzing genomics data. The speaker, Erwin Datema, stressed that he is neither a computer scientist nor a data scientist, yet he was able to use Spark effectively for interactive processing and querying of genomics data. The talk is told from a user's perspective and shows how domain experts in genomics can use Spark to accelerate their research.

KeyGene's goal in adopting Spark was faster, more efficient analysis of large-scale genomics data sets. The speaker highlighted the importance of high-throughput genomics in advancing biological research and described how Spark changed the way the company analyzes its data: by drawing on Spark's distributed computing capabilities, KeyGene was able to significantly reduce the time and resources needed to analyze complex genomics data sets. The broader message is that Spark is not a tool only for computer scientists and data scientists. Domain experts in other fields can also use it to gain new insights from their own data.