Big Data - Actionable Insights

Organizations are seeking to derive meaningful insights from diverse data sets. Quoin has the experience and engineering skills to harness Big Data – from defining the architecture to building a system that supports the aggregation, analysis, and generation of metadata that can drive actions.

Languages: Java, C/C++, Python, Scala
Frameworks: Agile, Scrum, Lean, Continuous Integration, Test-driven Development
Operating Systems: Linux, Mac OSX
Client Projects: PublicRelay, Envolve/LiveHealthier, Scholastic, Lowe's Company

Big Data is a broad discipline that encompasses system architecture, technologies for aggregation and persistence of complex data sets, and computing practices for deriving additional information. In other words, the discipline is focused on generating useful and actionable metadata. Big data practices and technologies have been applied to financial services, health care, communications, intelligence, and many other fields. The ability to analyze information at large scale has been driven by the widespread adoption of the Internet and mobile devices to deliver software functionality, and these channels will continue to produce complex data sets that can be leveraged for improved products and services.

Quoin has both the project experience and engineering skills to support a client embarking on a big data project. We have built sophisticated software platforms for content aggregation, analysis, and derivation of metadata. Our development experience includes significant projects in publishing, ecommerce, health services, and media monitoring.

For example, we built a 'segmentation engine' that analyzes diverse health data about an employee and assigns individuals to 'segments' for incentives, programs, or offers that target improved health. Our project team implemented this engine using a system architecture based on Apache Spark, Hadoop, and the Scala language. Another project required the development of a platform for aggregation of traditional and social media. The amount of content analyzed is substantial: the system evaluates over 30 million articles and social media posts per month and ingests approximately 2.5 million for further analysis. The content is then subject to both human analysis and quantitative metrics for attributes such as importance, level of interest, and sentiment.
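To make the segmentation idea concrete, the sketch below shows a minimal rule-based segmentation step in Python. The record fields, thresholds, and segment names are hypothetical illustrations, not the production rules, which in the actual engine were implemented at scale on Apache Spark.

```python
# Minimal, hypothetical sketch of a rule-based segmentation step.
# Field names, thresholds, and segment labels are illustrative only.

def assign_segments(employee: dict) -> list[str]:
    """Map an employee's health attributes to targeted segments."""
    segments = []
    if employee.get("steps_per_day", 0) < 5000:
        segments.append("activity-incentive")
    if employee.get("bmi", 0) >= 30:
        segments.append("nutrition-program")
    if not employee.get("annual_checkup_done", False):
        segments.append("preventive-care-offer")
    return segments

record = {"steps_per_day": 3200, "bmi": 31.5, "annual_checkup_done": True}
print(assign_segments(record))  # ['activity-incentive', 'nutrition-program']
```

In a distributed setting, a function like this would be applied to each record of a Spark data set rather than called one record at a time.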

Our organization has the technical skills for any big data project. We understand:

  • System architecture for aggregation and analytics
  • Content aggregation, mash-ups
  • Handling unstructured data and cleansing
  • Algorithms for data analytics and machine learning
  • Non-relational data persistence

The figure below illustrates a generalized reference architecture for big data, comprising core data management, content/data sources, and end-user applications that access the analytics. Enterprise systems typically include operational, reporting, and analytics data. Each type of data has specific system dependencies and relevant technologies. Although a big data project would use a non-relational data store such as Hadoop, operational and reporting databases often use conventional, relational technologies. This hybrid of relational and non-relational technologies is required to support both legacy systems and disparate types of data.

Generalized Big Data Architecture

The sources include both content and system sources. Content and data sources could include traditional media outlets, social media platforms (e.g., Twitter), specialized content feeds, and data publishers. These sources are usually external to an organization and its enterprise systems. The data and content are ingested by the system, supported by complex transformation rules and processes. System sources include the range of transactional and other enterprise systems; these are typically the existing applications built on operational databases.
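The ingestion step above can be sketched as a per-source transformation that normalizes heterogeneous records into a common schema. The source labels and field names below are hypothetical; a production pipeline would carry far richer rules.

```python
# Hypothetical sketch of ingest-time transformation rules that map
# records from different sources into one common schema.

def normalize(source: str, raw: dict) -> dict:
    """Apply per-source rules to produce a common content record."""
    if source == "social":
        return {
            "source": source,
            "author": raw["user"]["handle"],
            "text": raw["body"],
            "published": raw["created_at"],
        }
    if source == "news":
        return {
            "source": source,
            "author": raw.get("byline", "unknown"),
            "text": raw["headline"] + " " + raw["lede"],
            "published": raw["date"],
        }
    raise ValueError(f"no transformation rules for source: {source}")

post = {"user": {"handle": "@example"}, "body": "Sample post",
        "created_at": "2024-01-01"}
print(normalize("social", post)["author"])  # @example
```

Keeping the rules per source makes it straightforward to add a new feed without touching downstream analysis, which only ever sees the common schema.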

The aggregated content is then subject to analysis, using one or more techniques, categorized as follows:

  • Extraction
  • Classification
  • Segmentation
  • Curation
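As a toy illustration of the classification step, the sketch below tags content with topic labels via keyword matching. Real systems typically use trained models; the topics and keywords here are entirely hypothetical.

```python
# Hypothetical sketch of the classification technique: a keyword-based
# topic tagger. Production systems would use trained classifiers.

TOPIC_KEYWORDS = {
    "finance": {"earnings", "revenue", "ipo"},
    "health": {"clinic", "vaccine", "wellness"},
}

def classify(text: str) -> list[str]:
    """Return every topic whose keyword set overlaps the text."""
    words = set(text.lower().split())
    return sorted(topic for topic, keywords in TOPIC_KEYWORDS.items()
                  if words & keywords)

print(classify("Quarterly earnings beat revenue forecasts"))  # ['finance']
```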

The results of this system workflow are then available to end users. Typical applications include business intelligence (BI), analytics, search, visualization, and reporting applications.