NEW YORK—Why do companies struggle with Big Data, and why is Ashish Thusoo, founder and CEO of cloud-scale data processing company Qubole, concerned about it? The answer is obvious: Big Data gives companies a competitive advantage if they can manage it; unfortunately, they cannot always do so. It has been reported that only 27 percent of Big Data initiatives were classified as successful in 2014.
NEW YORK—The Data-Driven meetup has always been an effective mix of show-and-tell demos and fireside chats with its guests. Last September 27, one of New York’s best-attended meetups held its most inspired event this year, with an impressive lineup of guests packing every inch of the cavernous 480-seat AXA Equitable Center. A four-member panel of VCs from Silicon Valley talked candidly about building businesses around artificial intelligence, while other speakers talked about the new things they are doing at their companies.
Steady host Matt Turck of FirstMark Capital interviewed the VCs: Jeff Chung, managing director at AME Cloud Ventures; Mike Dauber, general partner at Amplify Partners; Jake Flomenberg, partner at Accel; and Aditya Singh, partner at Foundation Capital.
“Winners of tomorrow will be because AI was behind their product,” Singh said.
Adoption is at an early stage, but Singh believes “customers want solutions, not individual pieces.” He emphasized how his firm “helps you get customers and establish product-market fit.”
Flomenberg, for his part, thinks “we have a loose definition of AI.” He sees potential in computer vision.
Dauber thinks AI is real, if only the hype can find the right mix, but then there’s Google. “Google is who I am worried about. I think they can beat us senseless,” he said. On top of that, he thinks “access to (second round) capital is not easy.”
Chung looks forward to medical records being scanned in ways that leverage big data.
The healthcare industry is an endless curiosity for VCs, but Dauber probably put it best: “Healthcare is the most exciting and terrifying vertical,” he said, adding that it faces many regulations.
Even if money flows to startups in the artificial intelligence space, Dauber thinks “technical people are hard to find,” which slows any development.
Chung agrees: “It’s a challenge if you don’t have a strong foundational team. It is a challenge whether you are here or in San Francisco. Many are coming from academia.”
The summer hiatus certainly did the Data-Driven Meetup some good as it offered more interesting presentations.
Other guests were Noah Weiss, head of Search, Learning, & Intelligence at Slack; Praveen Murugesan, engineering manager at Uber; and Jeremy Stanley, VP of Data Science at Instacart (the one-hour grocery delivery platform). Weiss talked about Slack’s beginnings, how it grew out of IRC chat, text messaging and Facebook. And lest anyone forget, it made its start as a game.
“Macro trends plus the shift to mobile (formed) into a perfect storm,” Weiss said.
Now Slack is looking to address the increasing volume of communication by helping people focus on the conversations they really need to read. Categorizing messages by priority, as well as having a fully “indexable” search, should help someone catch up with a team after missing a day or two.
Carlos Guestrin, Amazon professor of Machine Learning at the University of Washington, and founder and CEO of Turi (a machine learning startup recently bought by Apple) also had a great presentation along with Kostas Tzoumas, founder and CEO of Data Artisans (a company implementing Apache Flink, stream data processing).
With Instacart, Stanley talked about how its 100 staff work to make sure it delivers within 60 minutes as it tries to capture its 600-million market through its product and retail partnerships. “Delivering orders really matters….(It’s) critical for customer happiness,” he said, adding that the company has achieved profitable unit economics driven in part by huge decreases in fulfillment time.
How does Uber operate in 75 countries and 500 cities? Murugesan credits its thousands of city operators, the on-the-ground teams who run and scale its transportation network, along with hundreds of data scientists and analysts and its engineering teams.
“We do A/B experimentations, spend analysis, build automated data applications,” he said, adding that Uber has a scalable ingestion model: a homegrown streaming ingestion solution and a Hadoop data lake (no more limits to storage).
Guestrin asked, “Machine learning is hot, but can you trust it? How do we know they’re working? You deploy a model and do A/B testing.”
He used Netflix as an example of an AI system that we trust.
NEW YORK—Last July 18, HUI Central featured Clarifai, the three-year-old artificial intelligence company that focuses on visual recognition and solving real-world problems for businesses and developers, in its Midtown East office.
What problems? Imagine having hundreds of images and having to tag each one of them on your site. That would be too much of a chore. Clarifai does the tagging for you automatically when you upload them.
Presenter Cassidy Williams showed Clarifai’s powerful image and video recognition technology, built on machine learning systems and made available to developers via a clean API. Williams showed how the technology works using “convolutional neural networks.” It reportedly improves its image recognition capability with consistent use.
Williams compared convolution to adjacent, saying the former is fast to train and can find multiple items, whereas the latter offers no recognition of spatial structure but is good for finding a single item. Both, she said, create a multilayer neural network.
What are convolutional neural networks? Deeplearning.net defines them as “biologically-inspired variants of MLPs. From Hubel and Wiesel’s early work on the cat’s visual cortex, we know the visual cortex contains a complex arrangement of cells. These cells are sensitive to small sub-regions of the visual field, called a receptive field. The sub-regions are tiled to cover the entire visual field. These cells act as local filters over the input space and are well-suited to exploit the strong spatially local correlation present in natural images.
“Additionally, two basic cell types have been identified: Simple cells respond maximally to specific edge-like patterns within their receptive field. Complex cells have larger receptive fields and are locally invariant to the exact position of the pattern.
“The animal visual cortex being the most powerful visual processing system in existence, it seems natural to emulate its behavior. Hence, many neurally-inspired models can be found in the literature.”
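The idea of cells acting as local filters can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not Clarifai’s implementation: the function names `apply_local_filter` and `max_pool`, the toy image and the hand-picked edge kernel are all hypothetical. A real network learns its filters from data rather than having them written by hand. The first function plays the role of the “simple cells” (a small filter slid over every sub-region), and the second loosely mimics the “complex cells” (a response that tolerates small shifts in position).

```python
def apply_local_filter(image, kernel):
    """Slide a small kernel over the image (no padding) and
    return the filter response at every position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    response = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        response.append(row)
    return response


def max_pool(response, size=2):
    """Keep the strongest response in each non-overlapping size x size window."""
    pooled = []
    for r in range(0, len(response) - size + 1, size):
        row = []
        for c in range(0, len(response[0]) - size + 1, size):
            window = [response[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical toy image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# The same picture with the edge moved one pixel to the left.
shifted = [
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [0, 1, 1, 1, 1],
]
# Hand-written filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

edges = apply_local_filter(image, kernel)
edges_shifted = apply_local_filter(shifted, kernel)
print(edges)                    # strong response only where the edge sits
print(max_pool(edges))          # pooled summary of the original image
print(max_pool(edges_shifted))  # same summary despite the one-pixel shift
```

The filter fires only at the column where the edge sits, and the pooled output is identical for both images, which is the kind of local position invariance the quoted passage attributes to complex cells.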
Today, big companies are confident that deep learning can handle large data sets, helped by greater computing power. It’s a game changer for AI prototyping. Not only that, it can serve as a boon for advertisers trying to pinpoint the best use, and even the best timing, for photos or videos.
Clarifai has a REST API that can be integrated with your preferred language, along with Python, Java and Node.js APIs. For more info, visit developer.clarifai.com or github.com/clarifai.
NEW YORK–“Why is AI (artificial intelligence) stuck?” asked Gary Marcus of Geometric Intelligence. “Because it has fallen in love with statistics and big data.” He was showing how, in so many ways, AI is not where we thought it would be by now. For example, one would expect online translation to be more precise by now, but it really is not. Quoting Peter Thiel, he also said: “We wanted flying cars, instead we got 140 characters,” in reference to Twitter, of course.
Marcus was at the Data-Driven meetup last January 19 at Bloomberg. Marcus, a scientist, bestselling author and entrepreneur, had the crowd of data scientists, developers and business intelligence analysts chuckling along with his funny yet whip-smart and practical insights. He is also a professor of psychology and neural science at NYU.
The other presenters were Amir Orad, CEO of Sisense, which handles business intelligence for complex data; Shivon Zilis, investor at Bloomberg Beta, an early-stage VC firm; and Dan Scholnick, general partner at Trinity Ventures, a VC firm based in Silicon Valley.
Orad likes to say that Sisense, started 8 years ago, came about because of 5 data geeks who met in university and wanted to make business intelligence understandable, cost-efficient and accurate, adding that “the more complex your data the more you spend.”
Sisense is bringing disruptive simplicity to big data and multi-source data. He ran through a list of things the company is looking into: a DBA to build the database; defining what data will be queried; joining tables upfront; normalizing and creating a star schema.
What lessons have they learned at Sisense? “Dream big. Refine benefits. Don’t automate, obliterate. Disrupt, don’t improve. Be totally different, that’s the only way to offer value,” he said.
“Speed is not the end game but beginning of something else,” he added.
Shivon Zilis of Bloomberg Beta gave us updates on the companies that the venture capital fund is investing in. There were hundreds of them, which she had no time to explain and instead showed, slide after slide, as the logos of many recognizable names. She termed it an “explosion of activity” with “startups focusing on niches that provide immediate value.”
Across all these investments, Zilis listed the following what-if scenarios that we certainly hope can be solved: What if I had the same support as a Fortune 400 CEO? What if I never had to feel lonely again? What if I never had to go to a primary care physician? What if I could measure the effectiveness of every word I said? What if I never had to drive again?
Some realistic expectations: in five years, it will be crazy for a farmer to overwater their fields, and in five years it will be crazy to ever hit “publish” without using a domain-specific text optimizer, one that makes you smarter even when you’re not using it.
It was also good to hear Scholnick of Trinity Ventures say that his VC firm doesn’t outsource work to junior staff, which has become important for startups looking to reach the decision makers right away.
As for hiring, he advised startups to make sure they’re hiring people with the right experience.