Stephen Miller, Co-Founder & SVP of Engineering at Fyusion – Interview Series

Published

October 6, 2021

Antoine Tardif

Stephen Miller is the co-founder and SVP of Engineering at Fyusion, a 3D imaging and computer vision company that is part of the Cox Automotive group. Prior to founding Fyusion, he was a PhD student at Stanford University studying Computer Science, and he worked on personal robotics, such as laundry folding and surgical knot tying, during his undergraduate studies at UC Berkeley. He’s a Google Hertz Fellow, SAP Stanford Graduate Fellow, and NSF Fellow alumnus.

Could you explain what Fyusion is and how it enables the easy capture and visualization of 3D data?

Fyusion is a computer vision company that provides AI-driven, 3D customer experiences. We enable people to capture 3D images via a simple smartphone app that runs on most Android and iOS devices. The app has step-by-step guidance and is designed to be used by anyone, regardless of technical prowess. Capturing an image takes a minute or two. From there our AI engine, ALIS, can analyze 3D images and turn that visual data into actionable information. Right now we’re focused on using 3D images to diagnose exterior damage to cars.

Could you explain how the algorithms use the .fyuse file format to enable a single smartphone camera to create 3D images?

I find it helpful to consider the .fyuse format alongside photos and videos. A photo captures a moment in time from a fixed angle, and a video captures a series of those moments in a linear timeline. By contrast, a .fyuse image captures what we like to call “a moment in space.” A viewer isn't confined to a single angle or linear timeline: They can see not only one side of something, but also around it.

To create a .fyuse image, the photographer circles their subject in one direction with a cell phone camera. Alternatively, Fyusion technology is compatible with fixed imaging solutions and with non-traditional imaging solutions such as drones.

Our .fyuse file format is what brings these images to life. It’s lightweight and enables complex, multi-faceted interactivity. It’s also completely compatible with laptops, tablets and smartphones that the everyday user already has in their arsenal.

Could you discuss some of the data that is captured and analyzed with Fyusion?

With cars, ALIS recognizes every part of the vehicle, and then can determine where there is damage, the size and severity of the damage, and eliminate potential false positives, such as dirt kicked up from the road. The technology we’ve developed and patented can solve other problems, but this is the one we’re focused on right now.

Could you discuss what is the AI-based Lightfield Information Suite (ALIS)?

ALIS is the engine behind every Fyusion product. It enables lightweight 3D imaging and deep visual understanding. There are three parts that make up ALIS: Capture, Engine and Viewer. In the Capture module, the mobile application contains built-in tutorials and customizable workflows that allow users to capture high-quality 3D images using the majority of smartphones on the market. Fyusion’s image capture also supports DSLRs, drones and a host of other devices.

In the second step, Engine, ALIS analyzes those 3D images and turns them into actionable information, such as the types of damage our customers need reported. It can also back up its findings with high-resolution 2D images of the damage it finds.

Lastly, the Viewer displays the .fyuse file format. The .fyuse format is patented and lightweight, and provides an immersive 3D experience with fast load times. We are able to attach all sorts of content to a .fyuse, including audio, video and, of course, 2D images.

Fyusion is both AR- and VR-ready. How big do you believe these applications will be in the future?

Augmented reality is a billion-dollar industry that is becoming more mainstream, and it’s even easier to capture surroundings in 3D thanks to powerful new mobile devices and low-latency networks. As these technologies move into the mainstream, customer expectations of online experiences will be raised as fast as content creators can keep up.

Especially in the auto industry, with car buying increasingly going online, we anticipate a surge of interest in AR, VR, and 3D listings in the next few years. The goal is to transform a simple vehicle detail page (VDP) into a vehicle experience page (VEP), helping both large and small auto dealers continue to thrive. This can be anything from adding 3D logos and rich media tags onto listings to allowing shoppers to virtually place a set of golf clubs in the trunk of a car to see how they fit inside.

It will be exciting to see how these types of applications begin to work their way into mainstream use. I don’t think it will be a long wait.

Could you discuss the improvement in clickthrough rate and revenue that is seen in ecommerce from using 3D versus 2D images?

I’m most familiar with wholesale and retail auto sales. 3D imagery has created a new level of trust for online shoppers, which is especially critical with big-ticket items like cars.

Our internal data indicate that 3D images increase user engagement and time spent on vehicle detail pages, which in turn has been shown to increase car sales. Providing a lifelike 3D experience of the vehicle also builds positive sentiment towards the vendor by increasing trust.

One of the options with Fyusion is to process data locally or in the cloud. Could you discuss the benefits of each?

Locally, edge AI forces developers to work within considerable constraints, particularly for the use case of mobile phones. In addition to the standard concerns for any AI developer (How optimized is the network? How reliable are the results?), certain practical concerns set clear ceilings: memory pressure, battery drain, the possibility of your process being backgrounded by the user or operating system, and so on. And that’s assuming comparable CPUs and GPUs were available on the edge. Even for flagship devices, this is rarely the case.

On the edge, you need to plan for every possible corner case, whereas in the cloud any solution can be monitored and fine-tuned.

But collectively speaking, edge AI could be considered the perfect “autoscaling” solution: for each new user, you have an entirely new machine at your disposal. If you’ve optimized your network to run entirely on the edge, you can just as easily service two, or two million, clients.

While the beefiest hardware will always exist on cloud, it’s generally accepted that data is king. The more data, and the closer it is to raw, the better. AI on the edge has access to unprocessed, raw input data, with no restrictions. Whereas for a cloud AI solution, input data must either be processed (compressed, partial) or enormous, at which point bandwidth becomes a serious concern.
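To make the bandwidth point concrete, here is a rough back-of-the-envelope sketch. The frame count, resolution, compression ratio and uplink speed are illustrative assumptions chosen for the arithmetic, not Fyusion figures, and `upload_seconds` is a hypothetical helper.

```python
def upload_seconds(num_frames: int, bytes_per_frame: int, uplink_mbps: float) -> float:
    """Seconds needed to upload a capture over a link of the given speed."""
    total_bits = num_frames * bytes_per_frame * 8
    return total_bits / (uplink_mbps * 1_000_000)


# Assumed capture: 100 uncompressed 1920x1080 RGB frames, 3 bytes per pixel.
RAW_FRAME_BYTES = 1920 * 1080 * 3
# Assumed 20:1 lossy compression; real ratios depend on codec and content.
COMPRESSED_FRAME_BYTES = RAW_FRAME_BYTES // 20

# Assumed 10 Mbps uplink for both cases.
raw_time = upload_seconds(100, RAW_FRAME_BYTES, uplink_mbps=10.0)
compressed_time = upload_seconds(100, COMPRESSED_FRAME_BYTES, uplink_mbps=10.0)

print(f"raw: {raw_time:.1f} s, compressed: {compressed_time:.1f} s")
```

Under these assumptions the raw upload takes roughly 498 seconds and the compressed one roughly 25, which is the trade-off described above: a cloud model either waits on a very large upload or works from data that has already lost information.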

Because it is closest to the user, edge AI opens up a range of possibilities that cloud AI doesn’t. If it’s optimized to run in real time, it can provide feedback in real time, which means you can build solutions that not only ingest data but also encourage users to provide better data.
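As a minimal sketch of what real-time feedback on the device could look like, the function below turns per-frame measurements into a hint for the person capturing. Everything here is hypothetical: the `FrameStats` fields, the thresholds and the hint wording are assumptions for illustration and do not describe Fyusion’s app.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FrameStats:
    sharpness: float      # hypothetical score in [0, 1]; higher is sharper
    angle_degrees: float  # hypothetical camera angle around the subject


def capture_hint(
    frames: list[FrameStats],
    min_sharpness: float = 0.5,
    max_gap_degrees: float = 15.0,
) -> str:
    """Return a short instruction based on the frames captured so far."""
    if not frames:
        return "Start moving around the subject."
    # A blurry latest frame suggests the user is moving too fast.
    if frames[-1].sharpness < min_sharpness:
        return "Slow down: the last frame is blurry."
    # A large jump in angle means part of the subject was not covered.
    if len(frames) >= 2:
        gap = abs(frames[-1].angle_degrees - frames[-2].angle_degrees)
        if gap > max_gap_degrees:
            return "Go back: part of the subject was skipped."
    return "Keep going."
```

Because a check like this runs per frame on the phone, the user can correct a blurry or skipped section during the capture itself, before any data leaves the device.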

How will 5G enable rapid growth in computer vision technology applications?

At faster connection speeds you can move more processing to the cloud, which opens up possibilities for all sorts of new computer vision applications. However, it really depends on the application and how widely it will be adopted.

5G could have a fragmented impact and further the digital divide, as some parts of the world have faster and faster connectivity while other areas will continue to have slow connectivity. Applications focused on people with access to 5G will obviously benefit. But more broadly adopted applications may have to choose between spending time and money for what will essentially become two versions of the same application, or sticking with one version that is less robust but can run on nearly any connection.

What steps is Fyusion undertaking to take advantage of future 5G rollout?

I want to preface this by saying that Fyusion has spent considerable time ensuring that customers can access our applications even on old phones with poor bandwidth availability. With Manheim alone our technology has imaged over a million cars, and we wouldn’t have achieved that otherwise.

That said, we’re very excited by what we’re seeing right now: a trifecta of increasing processing speeds, 5G connectivity and nothing short of a revolution in camera phones. Put it all together and you get some new developments I unfortunately can’t share with you yet.

Is there anything else that you would like to share about Fyusion?

It’s a very exciting time to work in computer vision. As a discipline, we’re moving into the mainstream after many years of being talked about as a future technology. Fyusion is growing fast and we’re hiring computer vision scientists from all over the world. Our team members can work from anywhere, but they are always welcome at our offices in Potrero Hill.

Thank you for the great interview. Readers who wish to learn more should visit Fyusion.

Unite.AI
