Panelists talk about rise of alternative data; Foursquare gets more sophisticated with Pilgrim

Jeff Glueck of Foursquare

By Dennis Clemente

Guest panelists talked about alternative data, with Foursquare and Captricity presenting their companies

NEW YORK–Last March 1, Matt Turck, host of the Data Driven meetup, sat down with his guests to talk about alternative data (no relation to alternative facts).

The alternative data panel consisted of Jeff Glueck, CEO of Foursquare; David Loaiza, managing director & chief data scientist of Point72; Andrej Rusakov, founder of Data Capital Management; and Matei Zatreanu, founder of System2.

Zatreanu described alternative data as a non-traditional form of data, later adding that it is more intuitive. Still, many seem to be downplaying its advantages.

What are the uses of alternative data? Before handing you a credit card, banks could turn to alternative sources of data if the usual information is not available. This could certainly make institutions less rigid, as it helps measure different types of businesses on a case-by-case basis.


Facebook shows roadmap to AI, Qubole addresses Big Data’s low success rate

NEW YORK–Why do companies struggle with Big Data, and why is Ashish Thusoo, founder and CEO of cloud-scale data processing company Qubole, concerned about it? The answer is obvious: Big Data gives companies a competitive advantage if they can manage it; unfortunately, they cannot always do so. It has been reported that only 27 percent of Big Data initiatives were classified as successful in 2014.

‘Winners of tomorrow will have AI,’ says VC

NEW YORK–The Data Driven meetup has always been an effective mix of show-and-tell demos and fireside chats with its guests. Last September 27, New York’s best-attended meetup held its most inspired event this year with its impressive lineup of guests, packing every inch of the cavernous 480-seater AXA Equitable Center. A four-person panel of VCs from Silicon Valley talked candidly about building businesses around artificial intelligence while other speakers talked about the new things they are doing in their companies.

Steady host Matt Turck of FirstMark Capital interviewed the VCs: Jeff Chung, managing director at AME Cloud Ventures; Mike Dauber, general partner at Amplify Partners; Jake Flomenberg, partner at Accel; and Aditya Singh, partner at Foundation Capital.

“Winners of tomorrow will be because AI was behind their product,” Singh said.

The adoption stage is early, but Singh believes “customers want solutions, not individual pieces,” emphasizing how his company “helps you get customers and establish product-market fit.”

Flomenberg, for his part, thinks “we have a loose definition of AI.” He sees potential in computer vision.

Dauber thinks AI is real, if only the hype finds the right mix, but then there’s Google. “Google is who I am worried about. I think they can beat us senseless,” he said. On top of that, he thinks “access to (second round) capital is not easy.”

Chung looks forward to having medical records scanned in ways that leverage big data.

The healthcare industry is an endless curiosity for VCs, but Dauber probably put it best: “Healthcare is the most exciting and terrifying vertical,” he said, adding how it faces so many regulations.

Even if money flows to startups in the artificial intelligence space, Dauber thinks “technical people are hard to find” to expedite any development.

Chung agrees: “It’s a challenge if you don’t have a strong foundational team. It is a challenge whether you are here or in San Francisco. Many are coming from academia.”

The summer hiatus certainly did the Data-Driven Meetup some good as it offered more interesting presentations.

Other guests were Noah Weiss, head of Search, Learning, & Intelligence at Slack; Praveen Murugesan, engineering manager at Uber; and Jeremy Stanley, VP Data Science at Instacart (the one-hour grocery delivery platform). Weiss talked about Slack’s beginnings, how it unfurled from IRC chat, text messaging and Facebook. And lest anyone forget, it made its start as a game.

“Macro trends plus the shift to mobile (formed) into a perfect storm,” Weiss said.

Now Slack is looking into addressing the increasing volume of communication by helping people focus on the conversations they really need to read. Categorizing messages in terms of priority as well as having a fully “indexable” search should help someone catch up with a team after missing a day or two.

Carlos Guestrin, Amazon professor of Machine Learning at the University of Washington, and founder and CEO of Turi (a machine learning startup recently bought by Apple) also had a great presentation along with Kostas Tzoumas, founder and CEO of Data Artisans (a company implementing Apache Flink, stream data processing).

With Instacart, Stanley talked about how its 100 staff work to make sure it delivers within 60 minutes as it tries to capture its 600-million market with its product and retail partnerships. “Delivering orders really matters… (It’s) critical for customer happiness,” he said, adding how it has achieved profitable unit economics driven in part by huge decreases in fulfillment time.

How does Uber operate in 75 countries and 500 cities? Murugesan credits its thousands of city operators, the on-the-ground teams who run and scale its transportation network, and hundreds of data scientists and analysts as well as its engineering teams.

“We do A/B experimentations, spend analysis, build automated data applications,” he said, adding that it has a scalable ingestion model: a homegrown streaming ingestion solution and a Hadoop data lake (no more limits to storage).

Guestrin asked, “Machine learning is hot, but can you trust it? How do we know they’re working?” His answer: “You deploy a model and do A/B testing.”

He used Netflix as an example of how we trust an AI system.

Data as complete, clean, contextual, consumable information

NEW YORK—When we’re drowning in data but still thirsting for information, what does that say about data’s role out there? Prakash Nanduri, founder and CEO of Paxata, thinks there is a way for business users and information workers to understand data more as information: “when it’s complete, clean, contextual and consumable.”

http://www.meetup.com/DataDrivenNYC/events/229759269/

Nanduri was at the Data-Driven meetup last April 11 at AXA Equitable Center with the heads of three other data companies: Haoyuan Li, CEO of Alluxio, on next-generation storage; Florian Douetteau, founder and CEO of Dataiku, on its data science platform; and Sri Ambati, founder and CEO of H2O.ai, with its machine learning API for smarter applications.

With its self-service data preparation software, Nanduri showed how Paxata works to make information out of data in various phases. Through visual guidance and a library of tools that help everyone make educated assumptions, Paxata proactively guides you through raw or messy data based on your history of data preparation.

From its library, Paxata recommends improvements based on crowdsourced answers. Lastly, it automatically transforms data for immediate consumption as it continuously learns from user interactions. A visual paradigm, he said, is created.

Can you make a data analyst and data engineer work together? Dataiku’s Douetteau thinks two mindsets can co-exist: the clickers and the coders. “You have to make those two work together.”

Dataiku is the developer of DSS, the integrated development platform for data professionals to turn raw data into predictions. The new integrated visual environment in DSS3 includes a dedicated production node feature that solves the problem of development environments being typically disconnected from and incompatible with production environments.

One can now deploy, test and roll back instances of data applications in the data engineering process, which permits the team to build, run and improve data products.

Haoyuan Li, CEO of Alluxio (formerly Tachyon), flew from California to talk about its memory-speed virtual distributed storage, with its memory-centric architecture designed for memory I/O.

Li talked about how Alluxio, renamed a month ago, has come a long way from the time it started in summer 2012 at the UC Berkeley AMPLab to the time it became open source in 2013 to its deployment in 100 companies. It has raised $7.5 million from Andreessen Horowitz, the leading VC firm based in Silicon Valley.

“We power up your workloads,” he said, citing how Baidu queries data 30 times faster (now). “We enable new workloads across storage systems. We work with frameworks of your choices and scale storage and compute independently.”

Ambati of H2O said the company scales statistics, machine learning and math over Big Data. It develops predictive analysis applications for such tasks as detecting fraudulent transactions, forecasting online customer purchases, and predicting the best time for running ads, among others.