Apache TLP + Incubator list, February 2018

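Lately I keep finding that an open-source project I have only just heard of has in fact been published under Apache for a long time! To deal with this, I decided to compile a list of the Apache TLPs and Incubator projects.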

See also the roundup published in February 2018 covering 10 recent, still little-known Apache projects that are useful for IoT.

1. What is Apache software?

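Open-source software managed by the Apache Software Foundation is called Apache software. Its license is relatively permissive: derivative works may be used commercially without publishing their source, subject to only a few conditions, and partly for that reason the amount of Apache software keeps growing. In the 1990s, Apache meant the web server, but in recent years it has been especially strong in big-data software such as Apache Hadoop.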

1.1. What does "Incubator" mean?

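An open-source project that applies to become Apache software and meets certain conditions is allowed to use the Apache name and is registered as an experimental project in the Incubator, where it receives support. (Such a project is also called a podling.)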

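Depending on certain conditions, an Incubator project is either discontinued or graduates. On graduation, it becomes either a subproject of an existing Apache TLP or a new TLP in its own right.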

1.2. What is a TLP?

A TLP (top-level project) is run as an official Apache project, organized around its own dedicated PMC (Project Management Committee).

2. TLP list

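According to the Apache project data, there are currently about 170 top-level projects. Java-related projects were the mainstream in the 2000s, while big-data projects dominate the 2010s.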

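Some committees (TLPs) host several open-source projects, and some host none at all. Note that each TLP defines its own categories, so many entries have no category set, and the categories shown do not necessarily reflect what a project actually does.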

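I have sorted the list in descending order of the date each project became a TLP. Among the projects from 2014 onward, you will spot many that have been in the news lately.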

Committee | Category | Description | Established
Apache Trafodion | big-data | webscale SQL-on-Hadoop solution enabling transactional or operational workloads. | 2017-12
Apache Guacamole | network | providing performant, browser-based remote access | 2017-11
Apache Impala | | a high-performance distributed SQL engine | 2017-11
Apache Mnemonic | | a transparent nonvolatile hybrid memory oriented library for Big data, High-performance computing, and Analytics | 2017-11
Apache Juneau | | a toolkit for marshalling POJOs to a wide variety of content types using a common framework, and for creating sophisticated self-documenting REST interfaces and microservices using VERY little code | 2017-10
Apache Kibble | | an interactive project activity analyzer and aggregator | 2017-10
Apache PredictionIO | big-data | a machine learning server built on top of state-of-the-art open source stack, that enables developers to manage and deploy production-ready predictive services for various kinds of machine learning tasks | 2017-10
Apache DRAT | | large scale code license analysis, auditing and reporting | 2017-09
Apache RocketMQ | | a fast, low latency, reliable, scalable, distributed, easy to use message-oriented middleware, especially for processing large amounts of streaming data | 2017-09
Apache Royale | | improving developer productivity in creating applications for wherever Javascript runs (and other runtimes) | 2017-09
Apache Fluo | | Storage and incremental processing of large data sets | 2017-07
Apache MADlib | | Scalable, Big Data, SQL-driven machine learning framework for Data Scientists | 2017-07
Apache Streams | | interoperability of online profiles and activity feeds | 2017-07
Apache Atlas | | scalable and extensible set of core foundational governance services | 2017-06
Apache Mynewt | | embedded OS optimized for networking and built for remote management of constrained devices | 2017-06
Apache SystemML | | A machine learning platform optimal for big data | 2017-05
Apache CarbonData | big-data | indexed columnar data format for fast analytics on big data platform | 2017-04
Apache Fineract | | Platform for Digital Financial Services | 2017-04
Apache Metron | | Real-time big data security | 2017-04
Apache Ranger | | framework to enable, monitor and manage comprehensive data security across the Hadoop platform. | 2017-01
Apache Beam | big-data | Programming model, SDKs, and runners for defining and executing data processing pipelines | 2016-12
Apache Eagle | | open source analytics solution for identifying security and performance issues instantly on big data platforms | 2016-12
Apache Geode | | Low latency, high concurrency data management solutions | 2016-11
Apache Kudu | | A distributed columnar storage engine built for the Apache Hadoop ecosystem | 2016-07
Apache Twill | | Use Apache Hadoop YARN’s distributed capabilities with a programming model that is similar to running threads | 2016-06
Apache Bahir | | Extensions to distributed analytic platforms such as Apache Spark | 2016-05
Apache TinkerPop | | A graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP) | 2016-05
Apache Zeppelin | big-data | A web-based notebook that enables interactive data analytics | 2016-05
Apache Apex | big-data | Enterprise-grade unified stream and batch processing engine | 2016-04
Apache AsterixDB | | open source Big Data Management System | 2016-04
Apache Johnzon | | JSR-353 compliant JSON parsing; modules to help with JSR-353 as well as JSR-374 and JSR-367 | 2016-04
Apache Sentry | | Fine grained authorization to data and metadata in Apache Hadoop | 2016-03
Apache Arrow | | Powering Columnar In-Memory Analytics | 2016-01
Apache Brooklyn | cloud | Framework for modeling, monitoring, and managing applications through autonomic blueprints | 2015-11
Apache Groovy | library | A multi-faceted language for the Java platform | 2015-11
Apache Kylin | | Extreme OLAP Engine for Big Data | 2015-11
Apache REEF | big-data | Retainable Evaluator Execution Framework | 2015-11
Apache Calcite | big-data, hadoop, sql | Dynamic data management framework | 2015-10
Apache Yetus | build-management, library, testing | Collection of libraries and tools that enable contribution and release processes for software projects | 2015-09
Apache Ignite | big-data, cloud, data-management-platform, database, distributed-sql-database, hadoop, iot, osgi, sql | High-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time | 2015-08
Apache Lens | big-data | Unified analytics platform | 2015-08
Apache Serf | library | High performance C-based HTTP client library built upon the Apache Portable Runtime (APR) library | 2015-08
Apache Usergrid | | The BaaS Framework you run | 2015-08
Apache NiFi | | Easy to use, powerful, and reliable system to process and distribute data | 2015-07
Apache Whimsy | content | Tools that help automate various administrative tasks or information lookup activities | 2015-05
Apache ORC | big-data, database, hadoop, library | the smallest, fastest columnar storage for Hadoop workloads | 2015-04
Apache Parquet | big-data | columnar storage format available to any project in the Apache Hadoop ecosystem | 2015-04
Apache Aurora | | Mesos framework for long-running services and cron jobs | 2015-03
Apache Polygene | library | community based effort exploring Composite Oriented Programming for domain centric application development | 2015-03
Apache Samza | big-data | distributed stream processing framework | 2015-01
Apache Falcon | big-data | Data management and processing platform. | 2014-12
Apache Flink | big-data | platform for scalable batch and stream data processing | 2014-12
Apache BookKeeper | big-data | Replicated log service which can be used to build replicated state machines | 2014-11
Apache Drill | big-data | Schema-free SQL Query Engine for Apache Hadoop, NoSQL and Cloud Storage | 2014-11
Apache MetaModel | big-data, database, library | common interface for discovery, exploration of metadata and querying of different types of data sources | 2014-11
Apache Storm | big-data | Distributed, real-time computation system | 2014-09
Apache Celix | network | Implementation of the OSGi specification adapted to C | 2014-07
Apache Tez | big-data | High-performance and scalable distributed data processing framework | 2014-07
Apache VXQuery | big-data, xml | A parallel XQuery processor | 2014-07
Apache Phoenix | big-data, database | High performance relational database layer over Apache HBase for low latency applications | 2014-05
Apache Allura | content | Forge software for hosting software projects | 2014-03
Apache Olingo | library | OASIS OData protocol libraries | 2014-03
Apache Tajo | big-data | Big data warehouse system on Apache Hadoop | 2014-03
Apache Knox | big-data | Simplify and normalize the deployment and implementation of secure Hadoop clusters | 2014-02
Apache Open Climate Workbench | content | Climate model evaluation | 2014-02
Apache Spark | big-data | Fast and general engine for large-scale data processing | 2014-02
Apache Helix | big-data, cloud | A cluster management framework for partitioned and replicated distributed resources | 2013-12
Apache Ambari | big-data | Hadoop cluster management | 2013-11
Apache Marmotta | | An Open Platform for Linked Data | 2013-11
Apache Chukwa | | Open source data collection system for monitoring large distributed systems. | 2013-10
Apache jclouds | cloud, library | Java cloud APIs and abstractions | 2013-10
Apache Curator | database, library | Java libraries that make using Apache ZooKeeper easier | 2013-09
Apache JSPWiki | content | Leading open source WikiWiki engine, feature-rich and built around standard J2EE components (Java, servlets, JSP). | 2013-07
Apache Mesos | cloud | a cluster manager that provides efficient resource isolation and sharing across distributed applications | 2013-06
Apache DeltaSpike | javaee | Portable CDI extensions that provide useful features for Java application developers | 2013-04
Apache Bloodhound | build-management | Issue tracking, wiki and repository browser | 2013-03
Apache CloudStack | cloud | Infrastructure as a Service solution | 2013-03
Apache cTAKES | content | Natural language processing (NLP) tool for information extraction from electronic medical record clinical free-text | 2013-03
Apache Clerezza | content, osgi | Semantically linked data for OSGi | 2013-02
Apache Crunch | big-data, library | Simple and Efficient MapReduce Pipelines | 2013-02
Apache Oltu | library | OAuth protocol implementation in Java | 2013-01
Apache OpenMeetings | network | OpenMeetings: Web-Conferencing and real-time collaboration | 2013-01
Apache Flex | web-framework | Application framework for expressive web applications that deploy to all major browsers, desktops and devices. | 2012-12
Apache Kafka | big-data | Distributed publish-subscribe messaging system | 2012-11
Apache Syncope | identity, security | Managing digital identities in enterprise environments | 2012-11
Apache Cordova | library, mobile | Platform for building native mobile applications using HTML, CSS and JavaScript | 2012-10
Apache Isis | web-framework | Framework for rapidly developing domain-driven apps in Java | 2012-10
Apache OpenOffice | content | An open-source, office-document productivity suite | 2012-10
Apache Airavata | big-data, cloud, network | Workflow and Computational Job Management Middleware | 2012-09
Apache Bigtop | big-data | Apache Hadoop ecosystem integration and distribution project | 2012-09
Apache SIS | library | Spatial Information System | 2012-09
Apache Stanbol | content | Reusable components for semantic content management | 2012-09
Apache Any23 | content | Anything to Triples | 2012-08
Apache Lucene.Net | database | Search engine library targeted at .NET runtime users. | 2012-08
Apache Oozie | big-data | A workflow scheduler system to manage Apache Hadoop jobs. | 2012-08
Apache Steve | library | Apache’s Python based single transferable vote software system | 2012-07
Apache Flume | big-data | A reliable service for efficiently collecting, aggregating, and moving large amounts of log data | 2012-06
Apache VCL | cloud | Virtual Computing Lab | 2012-06
Apache Giraph | big-data | Iterative graph processing system built for high scalability | 2012-05
Apache Hama | big-data | a Bulk Synchronous Parallel computing framework on top of Apache Hadoop | 2012-05
Apache ManifoldCF | content | Framework for connecting source content repositories to target repositories or indexes. | 2012-05
Apache Creadur | | Comprehension and auditing of software distributions | 2012-04
Apache Jena | library | Java framework for building Semantic Web applications | 2012-04
Apache Accumulo | database | Sorted, distributed key/value store | 2012-03
Apache Lucy | database | Search engine library for dynamic languages | 2012-03
Apache Sqoop | big-data | Bulk Data Transfer for Apache Hadoop and Structured Datastores | 2012-03
Apache Bval | javaee, library | Apache BVal: JSR-303 Bean Validation Implementation and Extensions | 2012-02
Apache OpenNLP | library | Machine learning based toolkit for the processing of natural language text | 2012-02
Apache Empire-db | database | Relational Data Persistence | 2012-01
Apache Gora | database | ORM framework for column stores such as Apache HBase and Apache Cassandra with a specific focus on Hadoop | 2012-01
Apache JMeter | testing | Java performance and functional testing | 2011-10
Apache Libcloud | cloud, library | Unified interface to the cloud | 2011-05
Apache Chemistry | library | CMIS (Content Management Interoperability Services) Clients and Servers | 2011-02
Apache River | javaee | Jini service oriented architecture | 2011-01
Apache Aries | library | Enterprise OSGi application programming model | 2010-12
Apache OODT | web-framework | Object Oriented Data Technology (middleware metadata) | 2010-11
Apache ZooKeeper | database | Centralized service for maintaining configuration information | 2010-11
Apache Thrift | http, library, network | Framework for scalable cross-language services development | 2010-10
Apache Hive | database | Data warehouse infrastructure using the Apache Hadoop Database | 2010-09
Apache Pig | database | Platform for analyzing large data sets | 2010-09
Apache Shiro | library, web-framework | Powerful and easy-to-use application security framework | 2010-09
Apache jUDDI | | Java implementation of the Universal Description, Discovery, and Integration specification | 2010-08
Apache Karaf | osgi, network | Server-side OSGi distribution | 2010-06
Apache Avro | big-data, library | A Serialization System | 2010-04
Apache HBase | database | Apache Hadoop Database | 2010-04
Apache Mahout | library | Scalable machine learning library | 2010-04
Apache Nutch | web-framework | Open Source Web Search Software | 2010-04
Apache Tika | library | Content Analysis and Detection Toolkit | 2010-04
Apache Traffic Server | http | A fast, scalable and extensible HTTP/1.1 compliant caching proxy server | 2010-04
Apache UIMA | | Framework and annotators for unstructured information analysis | 2010-03
Apache Cassandra | database | Highly scalable second-generation distributed database | 2010-02
Apache Subversion | build-management | Version Control | 2010-02
Apache Axis | http, network, xml | Java SOAP Engine | 2009-12
Apache OpenWebBeans | javaee | OpenWebBeans: JSR-299 Context and Dependency Injection for Java EE Platform Implementation | 2009-12
Apache Pivot | library | Rich Internet applications in Java | 2009-12
Apache Community Development | | Resources to help people become involved with Apache projects | 2009-11
Apache PDFBox | content, library | Java library for working with PDF documents | 2009-10
Apache Sling | | Web Framework for JCR Content Repositories | 2009-06
Apache Camel | network, osgi | Spring based Integration Framework which implements the Enterprise Integration Patterns | 2008-12
Apache Attic | | A home for dormant projects | 2008-11
Apache Buildr | build-management | Simple and intuitive build system for Java applications | 2008-11
Apache CouchDB | big-data, cloud, content, database, http, network | RESTful document database | 2008-11
Apache Qpid | network | Multiple language implementation of the latest Advanced Message Queuing Protocol (AMQP) | 2008-11
Apache CXF | library, network, xml | Service Framework | 2008-04
Apache Archiva | build-management | Build Artifact Repository Manager | 2008-03
Apache Hadoop | database | Distributed computing platform | 2008-01
Apache Synapse | http, network, xml | Enterprise Service Bus and Mediation Framework | 2007-12
Apache HttpComponents | http, library, network | Java toolset of low level HTTP components | 2007-11
Apache ServiceMix | network, osgi, xml | Enterprise Service Bus | 2007-09
Apache ODE | network, xml | Orchestration Director Engine: Business Process Management (BPM), Process Orchestration and Workflow through service composition. | 2007-07
Apache Commons | http, library, network | Reusable Java components | 2007-06
Apache Wicket | web-framework | Component-based Java Web Application Framework. | 2007-06
Apache OpenJPA | database, javaee, library | OpenJPA: Object Relational Mapping for Java | 2007-05
Apache POI | content, library | Java API for OLE 2 Compound and OOXML Documents | 2007-05
Apache TomEE | network | Java EE Web Profile built on Apache Tomcat | 2007-05
Apache Turbine | web-framework | A Java Servlet Web Application Framework and associated component library | 2007-05
Apache Felix | network | OSGi Framework and components | 2007-03
Apache Roller | content | Java blog server | 2007-02
Apache ActiveMQ | network | Distributed Messaging System | 2007-01
Apache Cayenne | database, library, network, web-framework, xml | User-friendly Java ORM with Tools | 2006-12
Apache OFBiz | content, database, http, network, web-framework, xml | Open for Business: enterprise automation software | 2006-12
Apache Tiles | web-framework | A templating framework for web application user interfaces | 2006-12
Apache Labs | | A place for innovation where committers of the foundation can experiment with new ideas | 2006-11
Apache MINA | network | Multipurpose Infrastructure for Network Application | 2006-10
Apache Velocity | library | A Java Templating Engine | 2006-10
Apache Santuario | library, security, xml | XML Security in Java and C++ | 2006-06
Apache Jackrabbit | database, library, network, xml | Content Repository for Java | 2006-03
Apache Tapestry | web-framework | Component-based Java Web Application Framework | 2006-02
Apache Tomcat | http, javaee, network | A Java Servlet and JSP Container | 2005-05
Apache Directory | network | Apache Directory Server | 2005-02
Apache MyFaces | javaee, web-framework | JavaServer(tm) Faces implementation and components | 2005-02
Apache Xerces | xml | XML parsers in Java, C++ and Perl | 2005-02
Apache Lucene | database, library, search | Search engine library | 2005-01
Apache Xalan | xml | XSLT processors in Java and C++ | 2004-10
Apache XML Graphics | graphics | Conversion from XML to graphical output | 2004-10
Apache SpamAssassin | mail | Mail filter to identify spam | 2004-06
Apache Forrest | build-management, database, graphics, http, network, web-framework, xml | Aggregated multi-channel documentation, separation of concerns | 2004-05
Apache Geronimo | http, javaee, network, web-framework | Java2, Enterprise Edition (J2EE) container | 2004-05
Apache Struts | web-framework | Model 2 framework for building Java web applications | 2004-03
Apache Gump | build-management, testing | Continuous integration of open source projects | 2004-02
Apache Portals | web-framework | Portal technology | 2004-02
Apache Logging Services | | Cross-language logging services | 2003-12
Apache Maven | build-management | Java project management and comprehension tools | 2003-03
Apache Cocoon | database, graphics, http, network, web-framework, xml | Web development framework: separation of concerns, component-based | 2003-01
Apache James | mail, network | Java Apache Mail Enterprise Server | 2003-01
Apache Web Services | | Projects related to Web Services | 2003-01
Apache Ant | build-management | Java-based build tool | 2002-11
Apache Incubator | | Entry path for projects and codebases wishing to become part of the Foundation’s efforts | 2002-10
Apache DB | | Database access | 2002-07
Apache Portable Runtime (APR) | library | Apache Portable Runtime libraries | 2000-12
Apache Tcl | | Dynamic websites using TCL | 2000-07
Apache mod_perl | httpd-module | Dynamic websites using Perl | 2000-03
Apache HTTP Server | http, httpd-module, network | Apache Web Server (httpd) | 1995-02
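
For anyone who wants to reproduce the newest-first ordering used in the list above, here is a minimal Python sketch. It is only an illustration: the four rows are copied from the list, but holding them as (name, "YYYY-MM") tuples and the helper names `established` and `newest_first` are my own assumptions, not anything provided by Apache.

```python
from datetime import date

# A few rows taken from the TLP list. The (name, "YYYY-MM") layout is an
# assumption made for this sketch, not an official Apache data format.
rows = [
    ("Apache Kafka", "2012-11"),
    ("Apache Trafodion", "2017-12"),
    ("Apache HTTP Server", "1995-02"),
    ("Apache Spark", "2014-02"),
]


def established(row):
    """Turn the "YYYY-MM" string of a row into a date usable as a sort key."""
    year, month = row[1].split("-")
    return date(int(year), int(month), 1)


def newest_first(rows):
    """Return the rows sorted by establishment date, most recent first."""
    return sorted(rows, key=established, reverse=True)


for name, when in newest_first(rows):
    print(when, name)
```

Parsing the month into a real date keeps the sort correct even if the months are not zero-padded; a plain string sort would only work with strictly padded values.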

3. Incubator list

The Apache Incubator's own project listing also includes past projects. Here I list only the 50-odd projects currently in incubation, in reverse order of start date.

Project | Description | Start Date
Coral | Coral is a data processing system to flexibly control the runtime behaviors of a job to adapt to varying deployment characteristics. | 2018/2/4
ECharts | ECharts is a charting and data visualization library written in JavaScript. | 2018/1/18
PLC4X | PLC4X is a set of libraries for communicating with industrial programmable logic controllers (PLCs) using a variety of protocols but with a shared API. | 2017/12/18
SkyWalking | Skywalking is an APM (application performance monitor), especially for microservice, Cloud Native and container-based architecture systems. Also known as a distributed tracing system. It provides an automatic way to instrument applications: no need to change any of the source code of the target application; and a collector with a very high efficiency streaming module. | 2017/12/8
ServiceComb | ServiceComb is a microservice framework that provides a set of tools and components to make development and deployment of cloud applications easier. | 2017/11/22
Crail | Crail is a storage platform for sharing performance critical data in distributed data processing jobs at very high speed. | 2017/11/1
SDAP | SDAP is an integrated data analytic center for Big Science problems. | 2017/10/22
PageSpeed | PageSpeed represents a series of open source technologies to help make the web faster by rewriting web pages to reduce latency and bandwidth. | 2017/9/30
Amaterasu | Apache Amaterasu is a framework providing continuous deployment for Big Data pipelines. | 2017/9/7
Daffodil | Apache Daffodil is an implementation of the Data Format Description Language (DFDL) used to convert between fixed format data and XML/JSON. | 2017/8/27
Heron | A real-time, distributed, fault-tolerant stream processing engine. | 2017/6/23
Livy | Livy is a web service that exposes a REST interface for managing long running Apache Spark contexts in your cluster. With Livy, new applications can be built on top of Apache Spark that require fine grained interaction with many Spark contexts. | 2017/6/5
Pulsar | Pulsar is a highly scalable, low latency messaging platform running on commodity hardware. It provides simple pub-sub semantics over topics, guaranteed at-least-once delivery of messages, automatic cursor management for subscribers, and cross-datacenter replication. | 2017/6/1
Superset | Superset is an enterprise-ready web application for data exploration, data visualization and dashboarding. | 2017/5/21
Gobblin | Gobblin is a distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems. | 2017/2/23
MXNet | A Flexible and Efficient Library for Deep Learning | 2017/1/23
Ratis | Ratis is a java implementation for RAFT consensus protocol | 2017/1/3
Griffin | Griffin is an open source Data Quality solution for distributed data systems at any scale in both streaming or batch data context | 2016/12/5
Weex | Weex is a framework for building Mobile cross-platform high performance UI. | 2016/11/30
OpenWhisk | distributed Serverless computing platform | 2016/11/23
NetBeans | NetBeans is a development environment, tooling platform and application framework. | 2016/10/1
Spot | Apache Spot is a platform for network telemetry built on an open data model and Apache Hadoop. | 2016/9/23
Hivemall | Hivemall is a library for machine learning implemented as Hive UDFs/UDAFs/UDTFs. | 2016/9/13
Annotator | Annotator provides annotation enabling code for browsers, servers, and humans. | 2016/8/30
AriaTosca | ARIA TOSCA project offers an easily consumable Software Development Kit (SDK) and a Command Line Interface (CLI) to implement TOSCA (Topology and Orchestration Specification of Cloud Applications) based solutions. | 2016/8/27
SensSoft | SensSoft is a software tool usability testing platform | 2016/7/13
Traffic Control | Traffic Control allows you to build a large scale content delivery network using open source. | 2016/7/12
Pony Mail | Pony Mail is a mail-archiving, archive viewing, and interaction service, that can be integrated with many email platforms. | 2016/5/27
Gossip | Gossip is an implementation of the Gossip Protocol. | 2016/4/28
Airflow | Airflow is a workflow automation and scheduling system that can be used to author and manage data pipelines. | 2016/3/31
Quickstep | Quickstep is a high-performance database engine. | 2016/3/29
Omid | Omid is a flexible, reliable, high performant and scalable ACID transactional framework that allows client applications to execute transactions on top of MVCC key/value-based NoSQL datastores (currently Apache HBase) providing Snapshot Isolation guarantees on the accessed data. | 2016/3/28
Gearpump | Gearpump is a reactive real-time streaming engine based on the micro-service Actor model. | 2016/3/8
Tephra | Tephra is a system for providing globally consistent transactions on top of Apache HBase and other storage engines. | 2016/3/7
Edgent | Edgent is a stream processing programming model and lightweight runtime to execute analytics at devices on the edge or at the gateway. (Formerly known as Quarks) | 2016/2/29
Joshua | Joshua is a statistical machine translation toolkit | 2016/2/13
iota | Open source system that enables the orchestration of IoT devices. | 2016/1/20
Milagro | Distributed Cryptography; M-Pin protocol for Identity and Trust | 2015/12/21
Toree | Toree provides applications with a mechanism to interactively and remotely access Apache Spark. | 2015/12/2
S2Graph | S2Graph is a distributed and scalable OLTP graph database built on Apache HBase to support fast traversal of extremely large graphs. | 2015/11/29
Unomi | Unomi is a reference implementation of the OASIS Context Server specification currently being worked on by the OASIS Context Server Technical Committee. It provides a high-performance user profile and event tracking server. | 2015/10/5
Rya | Rya (pronounced “ree-uh” /rēə/) is a cloud-based RDF triple store that supports SPARQL queries. Rya is a scalable RDF data management system built on top of Accumulo. Rya uses novel storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes. Rya provides fast and easy access to the data through SPARQL, a conventional query mechanism for RDF data. | 2015/9/18
HAWQ | HAWQ is an advanced enterprise SQL on Hadoop analytic engine built around a robust and high-performance massively-parallel processing (MPP) SQL framework evolved from Pivotal Greenplum Database. | 2015/9/4
FreeMarker | FreeMarker is a template engine, i.e. a generic tool to generate text output based on templates. FreeMarker is implemented in Java as a class library for programmers. | 2015/7/1
SINGA | SINGA is a distributed deep learning platform. | 2015/3/17
Myriad | Myriad enables co-existence of Apache Hadoop YARN and Apache Mesos together on the same cluster and allows dynamic resource allocations across both Hadoop and other applications running on the same physical data center infrastructure. | 2015/3/1
SAMOA | SAMOA provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms that run on top of distributed stream processing engines (DSPEs). It features a pluggable architecture that allows it to run on several DSPEs such as Apache Storm, Apache S4, and Apache Samza. | 2014/12/15
Tamaya | Tamaya is a highly flexible configuration solution based on a modular, extensible and injectable key/value based design, which should provide a minimal but extendible modern and functional API leveraging SE, ME and EE environments. | 2014/11/14
HTrace | HTrace is a tracing framework intended for use with distributed systems written in java. | 2014/11/11
Taverna | Taverna is a domain-independent suite of tools used to design and execute data-driven workflows. | 2014/10/20
Slider | Slider is a collection of tools and technologies to package, deploy, and manage long running applications on Apache Hadoop YARN clusters. | 2014/4/29
DataFu | DataFu provides a collection of Hadoop MapReduce jobs and functions in higher level languages based on it to perform data analysis. It provides functions for common statistics tasks (e.g. quantiles, sampling), PageRank, stream sessionization, and set and bag operations. DataFu also provides Hadoop jobs for incremental data processing in MapReduce. | 2014/1/5
BatchEE | BatchEE project aims to provide a JBatch implementation (aka JSR352) and a set of useful extensions for this specification. | 2013/10/3
ODF Toolkit | Java modules that allow programmatic creation, scanning and manipulation of OpenDocument Format (ISO/IEC 26300 == ODF) documents | 2011/8/1