Panelists talk about rise of alternative data; Foursquare gets more sophisticated with Pilgrim


By Dennis Clemente

Guest panelists talked about alternative data, with Foursquare and Captricity presenting their companies

NEW YORK–Last March 1, Matt Turck, host of the Data Driven meetup, sat down with his guests to talk about alternative data (no relation to alternative facts).

The alternative data discussion featured Jeff Glueck, CEO of Foursquare; David Loaiza, managing director and chief data scientist of Point72; Andrej Rusakov, founder of Data Capital Management; and Matei Zatreanu, founder of System2.

Zatreanu described alternative data as a non-traditional form of data, later adding that it is more intuitive. Still, many seem to be downplaying its advantages.

What are the uses of alternative data? Before handing you a credit card, banks could turn to alternative sources of data if the usual information is not available. This could certainly make institutions less rigid, as it helps them assess different types of businesses on a case-by-case basis.


Analyzing data and extracting insights from IBM Watson Analytics

NEW YORK— Last November 10, the meetup called Data Science & Analytics for Communications Industry showed us how IBM Watson Analytics is making it easier for business professionals to analyze data and extract insights for businesses across data intensive disciplines, including marketing (social media and networks), sales, operations, finance and human resources.

Host Rachel Wells showed us how Watson Analytics works as a smart data discovery tool with guided data exploration, automated predictive analysis, and dashboard creation and visualization services. It is designed to help different professionals, from salespersons to company CEOs, find patterns and pursue ideas for their business.

In collaboration with industry partners, IBM’s new data discovery models, called Expert Storybooks, are aimed at guiding users on how to understand, learn and reason with different types of data sources to surface the most relevant facts and uncover patterns and relationships for predictive decision making. Examples of the types of Storybooks IBM will make available are as follows:

  • AriBall – a Storybook that will help users analyze the performance of baseball players to build predictions about player performance that they can use to gain an edge in their fantasy lineup.
  • Deloitte – a Storybook that measures the effectiveness of incentive programs to help sales leadership determine how and when to effectively deploy short term incentives for revenue uplift.
  • The Weather Company – a Storybook that helps users incorporate weather data into their revenue analysis to understand how weather is impacting their business.
  • OgilvyOne – a Storybook that shows users how to analyze marketing campaign data while integrating disparate data points such as weather information to bring creative inputs into campaign planning.
  • Twitter – a Storybook that helps users analyze social media data from Twitter to measure reputational risk, and also get a better understanding about how social sentiment could reveal drivers behind fluctuations in stock prices in real time.
  • American Marketing Association – a Storybook that helps users identify and analyze the key drivers of customer profitability.
  • Nucleus Research – a Storybook that enables users to benchmark projects for return on investment (ROI) and to project expected returns for proposed technology projects based on Nucleus Research data from more than 500 ROI case studies.
  • MarketShare – a Storybook that helps users achieve a clear understanding of how their investment strategy compares to industry standards, as well as a view into how to optimize investments across online and offline media channels such as TV, paid search, digital display, online video, radio, print, and others.
  • Intangent – a Storybook that will help finance managers examine the relationships between pay, performance, and credit risk in lending to better align incentive compensation with risk taking.

Instead of fumbling over data, searching for answers or testing hypotheses, the Watson Analytics user can focus on understanding the business and effectively communicating results to stakeholders. Business users often struggle to figure out what analysis would be relevant and how to tell the story in a report or diagram. Watson Analytics automates these steps so users can get to the answers quickly and on their own.

As users interact with the results, they can fine-tune their questions and the data to surface the most relevant facts and uncover unforeseen patterns and relationships, which will enable predictive decision making for all levels of users.

However, Wells was quick to point out the difference between the two: IBM Watson covers the whole reasoning process from A to Z, while IBM Watson Analytics is all about helping anyone explore data easily.

Facebook shows roadmap to AI, Qubole addresses Big Data’s low success rate

NEW YORK—Why do companies struggle with Big Data and why is Ashish Thusoo, founder and CEO of cloud-scale data processing company Qubole, concerned about it? The answer is obvious: Big Data gives companies a competitive advantage if they can manage it; unfortunately, they cannot always do so. It has been reported that only 27 percent of Big Data initiatives were classified as successful in 2014.

‘Winners of tomorrow will have AI,’ says VC

NEW YORK— The Data Driven meetup has always been an effective mix of show-and-tell demos and fireside chats with its guests. Last September 27, New York’s most well-attended meetup held its most inspired event this year with an impressive lineup of guests, packing every inch of the cavernous 480-seat AXA Equitable Center. A four-member panel of VCs from Silicon Valley talked candidly about building businesses around artificial intelligence while other speakers talked about the new things they are doing in their companies.

Steady host Matt Turck of FirstMark Capital interviewed the VCs: Jeff Chung, managing director at AME Cloud Ventures; Mike Dauber, general partner at Amplify Partners; Jake Flomenberg, partner at Accel; and Aditya Singh, partner at Foundation Capital.

“Winners of tomorrow will be because AI was behind their product,” Singh said.

The adoption stage is early, but Singh believes “customers want solutions, not individual pieces,” while emphasizing how his company “helps you get customers and establish product-market fit.”

Flomenberg, for his part, thinks “we have a loose definition of AI.” He sees potential in computer vision.

Dauber thinks AI is real, if only the hype finds the right mix, but then there’s Google. “Google is who I am worried about. I think they can beat us senseless,” he said. On top of that, he thinks “access to (second round) capital is not easy.”

Chung looks forward to scanned medical records that leverage big data.

The healthcare industry is an endless curiosity for VCs, but Dauber probably put it best: “Healthcare is the most exciting and terrifying vertical,” he said, adding that it faces so many regulations.

Even if money flows to startups in the artificial intelligence space, Dauber thinks “technical people are hard to find” to expedite any development.

Chung agrees: “It’s a challenge if you don’t have a strong foundational team. It is a challenge whether you are here or in San Francisco. Many are coming from academia.”

The summer hiatus certainly did the Data-Driven Meetup some good as it offered more interesting presentations.

Other guests were Noah Weiss, head of Search, Learning, & Intelligence at Slack; Praveen Murugesan, engineering manager at Uber; and Jeremy Stanley, VP Data Science at Instacart (the one-hour grocery delivery platform). Weiss talked about Slack’s beginnings, how it grew out of IRC chat, text messaging and Facebook. And lest anyone forget, it made its start as a game.

“Macro trends plus the shift to mobile (formed) into a perfect storm,” Weiss said.

Now Slack is looking into addressing the increasing volume of communication by helping people focus on the conversations they really need to read. Categorizing messages in terms of priority, as well as having a fully “indexable” search, should help someone catch up with a team after missing a day or two.

Carlos Guestrin, Amazon professor of Machine Learning at the University of Washington, and founder and CEO of Turi (a machine learning startup recently bought by Apple) also had a great presentation along with Kostas Tzoumas, founder and CEO of Data Artisans (a company implementing Apache Flink, stream data processing).

With Instacart, Stanley talked about how its 100 staff work to make sure it delivers within 60 minutes as it tries to capture its 600-million market with its product and retail partnerships. “Delivering orders really matters….(It’s) critical for customer happiness,” he said, adding that it has achieved profitable unit economics driven in part by huge decreases in fulfillment time.

How does Uber operate in 75 countries and 500 cities? Murugesan credits its thousands of city operators, the on-the-ground teams who run and scale its transportation network, along with hundreds of data scientists and analysts as well as its engineering teams.

“We do A/B experimentations, spend analysis, build automated data applications,” he said, adding that it has a scalable ingestion model: a homegrown streaming ingestion solution and a Hadoop data lake (no more limits to storage).

Guestrin asked, “Machine learning is hot, but can you trust it? How do we know they’re working? You deploy a model and do A/B testing.”

He used Netflix as an example of how we come to trust an AI system.
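For readers curious what “deploy a model and do A/B testing” can look like in practice, here is a minimal sketch, not anything the speakers showed. It assumes a simple two-proportion z-test comparing a conversion rate under the current model (group A) against a new model (group B); the function name and the sample numbers are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of groups A and B; return (z, two-sided p-value)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: 1,000 users per group, 100 vs. 130 conversions.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference between the two models is unlikely to be chance alone, which is one way a team might decide whether to trust the new model.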

Legacy technologies stall, plague big companies

NEW YORK— Rigid architectures, maintenance gaps and lack of updates have plagued many big companies, and may have contributed to an airline’s recent troubles, when flights were delayed or canceled. This has put a spotlight on legacy technologies, the topic at Tech in Motion’s meetup last September 22 at the West Monroe Partners offices.

The speakers shared their thoughts on legacy technologies.

Melanie Colton, VP of Product & Technology for Hearst, has transformed and helped to grow Hearst Magazines’ digital business through the delivery of the core content and distribution platform.  She has built and modernized Hearst’s approach to product development and technical execution, as well as grown a fledgling organization into a fully functioning product and technology department.

Colton said you do inherit a system when you move from one company to another. When do legacy technologies need to retire? “If it (does) not meet actual business needs of the company (or) things need to (be) rebuilt in a smarter way,” she said. “It was at that moment (we thought) we needed to start over.”

When Colton once asked about old systems at another company she worked for, she was surprised to hear employees say, “We don’t know (it) but we don’t touch it. No one here knows how to make it work.” Typically, she said, “You don’t understand your system (so) you’re afraid to touch (it).”

Other points she raised about the problems of legacy technologies:

  • Leveraging open source tech is good
  • Listen to the whole problem and not just the individual components
  • (There’s) fear and lack of investment
  • Some systems are expensive to rebuild
  • Organizational issues

Rafael Schloming, CTO and Chief Architect of Datawire, is a globally recognized expert on messaging. Previously, Schloming was a principal software engineer at Red Hat, where he led Red Hat’s technical engagements with the AMQP community. Rafael has a B.S. in computer science from MIT.

Before taking down legacy technologies, Schloming suggests looking for a good fit to solve your problem, and to evaluate the costs.

“(You don’t want to) put a square peg in a round hole,” he said.

A brittle system will not be any good. When that happens, more central governance is needed.

Joe Mongiat, a senior technical architect in West Monroe Partners’ Technology Integration practice, has more than 12 years of experience managing, designing, and implementing technology solutions that help clients realize a variety of business and operational benefits—from scalability and expansion to user productivity to information visibility. “If you think ahead of time, you can avoid costs,” he said.

Look at your organization: what tips the scales, the process or change management? How do you know if you can keep work in-house?

“If you build something you have to pay to maintain it. There’s always a balancing act,” he said, adding how important it is to really have a clear vision and road map.

The cloud as commodity, just like your utility bill

NEW YORK–What if you could pay for your Internet and mobile phone data usage the way you pay your gas or electricity bill? At the TheoRise event at Rise last July 15, Nich Chung of Paper and Soap hosted a talk on the cloud as commodity with panelists Tim Martin, COO of Universal Compute; Jon Finkel, head trader and managing partner of Landscape Capital; and Jack Thorburn, COO of Global Commodities.

The topic is timely as TheoRise theorizes how The Cloud could be treated (and traded) like a commodity or utility as innovation and investment slow, and chasing competitive parity becomes the norm among the Apples, Amazons and Googles of the world.

If not, many prophesy what is termed “internet bankruptcy.” “(Many companies) would not know how to pay for this,” one panelist said.

Developers use up more space as the company they work for grows and innovates, which makes it all the more important for the cloud to work as a commodity.

In our world of on-demand and open-source, TheoRise said approaching IT as a commodity would not only be a cost-effective measure, it could help pave the way for a truly neutral, universally accessible Internet. Even developing countries that can’t afford the cloud could benefit substantially from it. “It will be advantageous for bit players,” Finkel said.

“(The cloud) would be like a metering company,” Martin said.

It’s a new way of thinking that will usher in ways of measuring services in the cloud.

It’s already happening. Martin said businesses can now measure their cloud consumption. When you plug your electric appliances, you don’t know where that power is coming from. With IT, there is a sense of security as much as there are ways to get data.

The cloud is a collection of physical assets. Its computing power will be a key resource for a company, creating transparency, according to Thorburn.

Commodity markets trading a contract in cloud services is not far-fetched.

The meetup used instant feedback and data from Remesh.

Taking an intelligence augmentation journey and harnessing stream data

NEW YORK—At the AXA Center last June 13, the Data-Driven meetup featured speakers Christopher Nguyen, founder and CEO of Arimo (data intelligence platform); Adam Kanouse, CTO of Narrative Science (transforms data into meaningful and insightful narratives); Neha Narkhede, founder and CTO of Confluent (real-time data platform built around Apache Kafka); and Nitay Joffe, founder and CTO of ActionIQ (next-generation data platform for marketing and consumer data).

Nguyen talked about his company Arimo, formerly Adatao, which started building its platform with pAnalytics and Apache Spark. It was designed to be a comfortable working environment for data scientists and engineers, who work in familiar tools such as R, Python, SQL and Java.

Business and data science users can collaborate and drive high-quality decisions via Arimo’s predictive applications and tools, which give companies fast insights into critical business questions on big data. It is reportedly easy to use and deploy as it runs on existing SQL databases, Hadoop and cloud data sources. Businesses generate value from big data in minutes, not weeks or months. An API is provided for data scientists and engineers to work with the data and build applications on top of it that expose it to business users in a friendlier way. This means they can make better use of it and build charts and graphs on the fly in real time to suit their needs.

To that end, the business layer called pInsights lets end users query the data using natural language queries. The system learns from the data what types of data users are likely to ask about, and even learns as users query, so it can offer an as-you-type drop-down of likely queries, as you would get in Google search when you enter a search term.
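The as-you-type behavior described above can be illustrated with a toy sketch. This is not Arimo’s implementation; it simply assumes that suggestions are past queries sharing the typed prefix, ranked by how often they were asked. The class name `QuerySuggester` and the sample queries are hypothetical.

```python
from collections import Counter


class QuerySuggester:
    """Toy type-ahead: suggests previously seen queries, most frequent first."""

    def __init__(self):
        self.counts = Counter()

    def record(self, query: str) -> None:
        # Normalize case and whitespace so repeated queries count together.
        self.counts[query.strip().lower()] += 1

    def suggest(self, prefix: str, limit: int = 3) -> list[str]:
        prefix = prefix.strip().lower()
        matches = [(q, n) for q, n in self.counts.items() if q.startswith(prefix)]
        # Most frequent first; ties broken alphabetically for stable output.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [q for q, _ in matches[:limit]]


suggester = QuerySuggester()
for past_query in ["sales by region", "Sales by region", "sales by month", "churn by plan"]:
    suggester.record(past_query)

print(suggester.suggest("sal"))  # ['sales by region', 'sales by month']
```

Every recorded query nudges the ranking, which is the “learns as users query” idea in miniature.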

“We are on an intelligence augmentation journey,” Nguyen said.

Kanouse’s Narrative Science is focused on creating software, starting with its advanced language generation platform, Quill, which transforms data into narratives people can read, so they spend less time crunching numbers. “Presenting analysis takes 3 to 4 hours of your time.”

Confluent, founded by the creators of Apache Kafka, enables organizations to harness business value from stream data. The Confluent Platform manages the barrage of stream data and makes it available throughout an organization.

It provides various industries, from retail, logistics and manufacturing, to financial services and online social networking, a scalable, unified, real-time data pipeline that enables applications ranging from large volume data integration to big data analysis with Hadoop to real-time stream processing.

Narkhede presented Confluent, emphasizing how Kafka Connect does the hard work:

  1. Scale out
  2. Fault tolerance
  3. Central Management
  4. Schema propagation

What is stream processing? It isn’t necessarily transient, approximate or “lossy.”

Confluent’s developers have reportedly contributed more than 76 percent of all the open source Apache Kafka code, and built some of the largest production deployments in the world.

The company recently introduced Confluent Control Center, which lets operators examine their data to understand message delivery, spot possible bottlenecks, and observe the end-to-end deliverability of messages in their native Kafka environment. Its Control Center UI allows operators to connect new data sources to the cluster and configure new data source connectors.

Apache Kafka is widely adopted for use cases ranging from collecting user activity data, logs and application metrics to stock ticker data and device instrumentation.

Its key strength is its ability to make high volume data available as a real-time stream for consumption in systems with very different requirements — from batch systems like Hadoop, to real-time systems that require low-latency access, to stream processing engines that transform the data streams as they arrive.

Apache Kafka is key data infrastructure that can serve as a single central nervous system, transmitting messages to all the different systems and applications within an organization.
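The idea of one stream feeding systems with very different needs can be sketched in a few lines. What follows is an in-memory stand-in, not the Kafka API: it assumes a topic is an append-only list and that each consumer tracks its own read position, so a batch job and a real-time job can read the same messages at their own pace. The class `TopicLog` and the event names are hypothetical.

```python
class TopicLog:
    """In-memory stand-in for a single topic: an append-only list of messages."""

    def __init__(self):
        self.messages = []
        self.offsets = {}  # consumer name -> index of the next message to read

    def publish(self, message):
        self.messages.append(message)

    def poll(self, consumer, max_messages=None):
        """Return unread messages for this consumer and advance its offset."""
        start = self.offsets.get(consumer, 0)
        end = len(self.messages)
        if max_messages is not None:
            end = min(start + max_messages, end)
        self.offsets[consumer] = end
        return self.messages[start:end]


log = TopicLog()
for event in ["click:home", "click:cart", "purchase:42"]:
    log.publish(event)

realtime_first = log.poll("realtime", max_messages=1)  # low-latency reader, one at a time
batch_all = log.poll("batch")                          # batch reader takes everything
realtime_rest = log.poll("realtime")                   # picks up where it left off
```

Because offsets are kept per consumer, reading by one system never removes messages for another, which is what lets a single log serve the whole organization.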

Funded by leading investors including Sequoia Capital and FirstMark Capital, Joffe’s ActionIQ is a stealth startup building the next generation of the marketing technology stack focused on customer data and business users.

ActionIQ has designed its platform to feel like playing with Legos.

“It can be more costly to (acquire) new customers than it is to retain an existing one,” said Joffe, who added that he hires as many data PhDs as he does UX designers.

Data as complete, clean, contextual, consumable information

NEW YORK—When we’re drowning in data but still thirsting for information, what does that say about data’s role out there? Prakash Nanduri, founder and CEO of Paxata, thinks there is a way for business users and information workers to understand data more as information: “when it’s complete, clean, contextual and consumable.”

http://www.meetup.com/DataDrivenNYC/events/229759269/

Nanduri was at the Data-Driven meetup last April 11 at AXA Equitable Center with speakers from three other data companies: Haoyuan Li, CEO of Alluxio, on next-generation storage; Florian Douetteau, founder and CEO of Dataiku, on its data science platform; and Sri Ambati, founder and CEO of H2O.ai, on its machine learning API for smarter applications.

Nanduri showed how Paxata’s self-service data preparation software makes information out of data in various phases. Through visual guidance and a library of tools that help everyone make educated assumptions, Paxata proactively guides you through raw or messy data based on your history of data preparation.

From its library, Paxata recommends improvements based on crowdsourced answers. Lastly, it automatically transforms data for immediate consumption as it continuously learns from user interactions. A visual paradigm, he said, is created.

Can you make a data analyst and a data engineer work together? Dataiku’s Douetteau thinks two mindsets can co-exist: the clickers and the coders. “You have to make those two work together.”

Dataiku is the developer of DSS, the integrated development platform for data professionals to turn raw data into predictions. The new integrated visual environment in DSS3 includes a dedicated production node feature that solves the problem of development environments that are typically disconnected from and incompatible with production environments.

One can now deploy, test and roll back instances of data applications in the data engineering process, which permits the team to build, run and improve data products.

Haoyuan Li, CEO of Alluxio (formerly Tachyon), flew from California to talk about its memory-speed virtual distributed storage, with its memory-centric architecture designed for memory I/O.

Li talked about how Alluxio, renamed a month ago, has come a long way from its start in summer 2012 at the University of California, Berkeley’s AMPLab, to going open source in 2013, to its deployment in 100 companies. It has raised $7.5 million from Andreessen Horowitz, the leading VC firm based in Silicon Valley.

“We power up your workloads,” he said, citing how Baidu queries data 30 times faster (now). “We enable new workloads across storage systems. We work with frameworks of your choices and scale storage and compute independently.”

Ambati of H2O said the company scales statistics, machine learning and math over Big Data. It develops predictive analysis applications for such tasks as detecting fraudulent transactions, forecasting online customer purchases, and predicting the best time for running ads, among others.

 

German startups MeteoViva, Minodes, Brandnew seek traction, funding in US


By Dennis Clemente

While gaining traction is crucial in the US market, it’s also where German startups can get funding to scale their businesses globally

NEW YORK–The presenters at the German Accelerator at Rise last March 22 had one thing in mind. They know the US market is big. While gaining traction is crucial in the US market, it’s also where they know they can get funding to scale their businesses globally.

German startups MeteoViva, Minodes and Brandnew presented their startups to a panel of venture capitalists: Urs Cete, managing partner at BDMI; Ulrich Quay, head at BMW Ventures; Alicia Syrett, founder and CEO at Pantegrion Capital; and Anton Waltz, managing director of US Digital Ventures.

MeteoViva helps customers save 15 to 40 percent of energy costs in corporate buildings with its unique SaaS solution. Its technology is reportedly based on a patented computer simulation model and was developed at RWTH Aachen University (Germany’s MIT).

It essentially predicts how much heating and cooling a building needs to maintain the desired climate at the lowest cost. It can reportedly be used in any building: factories, office buildings, shopping centers.

For retail analytics, Minodes offers insights into visitors’ in-store shopping behavior using Wi-Fi sensors it installs in stores. Data is viewable in its dashboard and in customizable email reports. Additional and more granular reports are provided depending on business requirements.

As it optimizes in-store customer pathways, Minodes also offers omnichannel retargeting and beacon campaigns. For instance, it retargets offline store visitors through Facebook and Google Apps. Now in 12 countries, it is in the United States to gain traction and get a better valuation.

Positioning itself as an influencer marketing service in its presentation, Brandnew connects brands with influential users on Instagram. It hopes to address the three key frustrations for brands and agencies (scalability, targeting and analytics) through its SaaS service, offered either on subscription or on a six-month basis. Rates are about $20,000 depending on the campaign.

A VC said it needs a “one money-shot sentence” for better positioning.

Dirk Kanngiesser, CEO of the German Accelerator in Silicon Valley, was in attendance to present the startups and VCs, along with CleanTech, Berlin’s largest industrial park, which is optimally aligned to the requirements of production-driven companies.

Platforms that pick your clothes, find the right people

By Dennis Clemente

Meetup showcases platforms that shop for you, pick the right people for the job

NEW YORK—The AXA Equitable Center makes for a grand entrance. So said Matt Turck, long-time host of Data Driven, as he welcomed the crowd to its majestic auditorium, complete with velvet curtains and flattering spotlight. One of the most attended meetups in the city, Data Driven is holding its meetups at AXA for a few months until the Bloomberg auditorium finishes its renovation.

At the meetup last March 16, Data Driven divided the talks based on its format. Eric Colson, chief algorithms officer of Stitch Fix; and Kieran Snyder, founder & CEO at Textio presented their companies while Peter Fenton, general partner at Benchmark and Eliot Horowitz, co-founder & CTO at MongoDB sat with Turck to discuss their companies and their industry in general.

Colson opened the night’s data talk with Stitch Fix. “There is no shopping in our site, because people hate shopping.” That got people’s attention. What Stitch Fix does is create your style profile and give you five hand-picked items. You keep what you like and send the rest back.

Recommendations have worked for several companies. They drive 35 percent of Amazon’s sales, 50 percent of LinkedIn’s connections, 75 percent of the movies watched on Netflix and 100 percent of the merchandise Stitch Fix sells.

Colson said they combine both data and human insights to make Stitch Fix work. There’s no denying the importance of human insights because of the wealth of experience behind them, according to him. “(But) they can’t be doing the same things as (its data/algorithms),” he said.

Next presenter Snyder said Textio mines data from recruiters and hiring managers to find patterns that work, showing how it helps companies hire better. It was as simple as copying and pasting a job posting from a site into a blank Textio field.

Using statistics and machine learning, it analyzes job text and outcomes data using listings from a set of companies. It uses the patterns it finds to predict the performance of a job posting and help you fix it before you ever publish it, with analytics and feedback right as you’re typing. It uses color to highlight phrases (green for those that work, red for the least successful ones), which should help its clients get the talent they need.
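The color-coding idea can be sketched very roughly as follows. This is not how Textio works internally; it assumes two hand-written phrase lists where a real system would learn them from hiring outcomes. Only “synergy” comes from Snyder’s talk; the other phrases and the function name `tag_phrases` are hypothetical.

```python
import re

# Hypothetical phrase lists; a real system would learn these from outcome data.
PHRASES_THAT_WORK = ["collaborative", "learn"]
PHRASES_THAT_FAIL = ["synergy", "rock star"]


def tag_phrases(text: str) -> list[tuple[str, str]]:
    """Return (phrase, color) pairs in the order the phrases appear in the text."""
    lowered = text.lower()
    colored = [(p, "green") for p in PHRASES_THAT_WORK]
    colored += [(p, "red") for p in PHRASES_THAT_FAIL]
    found = []
    for phrase, color in colored:
        # Word boundaries keep "learn" from matching inside "learned".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, lowered):
            found.append((match.start(), phrase, color))
    found.sort()
    return [(phrase, color) for _, phrase, color in found]


posting = "We want a rock star who creates synergy in a collaborative team."
print(tag_phrases(posting))
```

An editor could then render each tagged phrase in its color as the recruiter types.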

It offers real-time feedback as well as sharing and collaboration on job listings with colleagues. On average, Snyder claims that people who use Textio see a 24-percent lift in qualified applicants, a 17-percent drop in time to hire; and a 12-percent increase in underrepresented applicants. “We found words like synergy don’t work among underrepresented applicants,” she said.

Snyder said the best feedback loop comes from its customers, as she also observed how job listings can amplify a company’s voice, dispelling wrong assumptions about the lack of creativity in job listings. Expedia is one of its customers.

With MongoDB, Horowitz asked the audience, how many are frustrated with their databases? When MongoDB came into the picture in 2007, it was tackling what is seemingly a persistent problem with databases. In 2009, Salesforce ported their database to MongoDB.

“Developers say (MongoDB) is a pretty good experience,” says Horowitz, adding it looks forward to making users more productive by offering more ways for developers to keep using it.

Addressing a monetization question, Horowitz cited consulting and support, its BI tools and cloud services.

Started in 2007, MongoDB takes pride in having 85 percent of its work done in New York. In 2015, the company released its 3.2 version, which helps address a pernickety issue these days: encryption. It also started a BI connector with Tableau and Compass.

What does it take to be an entrepreneur? Peter Fenton, who invested in Twitter when it was only 25 people, echoed the sentiments of Paul Graham of Y Combinator: “Is the entrepreneur deeply authentic?” He also pointed out how feeling uneasy can actually work for you, if it means being grounded enough to think about the realities of the startup business.

“Take those two variables and layer around that,” he said.

As for figuring out which of the current crop of startups is promotional and which is authentic, he described the tech startup world in terms of how whales breach and then submerge again. “We’re moving (in a) cycle, but we’re making the ecosystem healthier.”

Fenton pointed out how institutional money may have given tech startups a longer capital runway and burn rate, but valuations do go down and money may not be as easy to get.

For radical growth, Fenton thinks ubiquity is crucial. However, he pointed out that some technologies have a gestation period (before they hit critical mass).