On October 16, Shenzhen Camsense Technology Co., Ltd. (hereinafter "Camsense") released Camsense XR, a VR controller kit based on visual and inertial positioning technology that enables monocular-vision 6DoF tracking for VR/AR. It is understood that the company has worked in computer vision for many years and that the kit's positioning performance is not inferior to that of mainstream products such as Oculus.
"With inside-out technology iterating the way it is in China, it may be hard to catch up with Quest within one or two years." VR Tuoluo got this answer while chatting with an industry practitioner shortly after the Quest 2 launch event.
From head control to today's 6DoF, iterative improvements in VR hardware are driving the rapid evolution of the content ecosystem. Vision-based inside-out tracking has made it possible to port popular PC VR games such as "Beat Saber" and "SuperHot VR", and has also encouraged the emergence of many highly recognizable mobile VR titles with innovative gameplay.
Admittedly, in the early stage of the VR industry, progress in the underlying technology has depended on heavy investment from Facebook, Microsoft, Sony and other large companies. Now a standard feature for mobile VR games, the inside-out tracking system has also become the strongest moat in the major manufacturers' underlying technology ecosystems.
In China, vision-based inside-out tracking is still in its infancy, and manufacturers are actively exploring and deploying other tracking solutions, based on electromagnetic and ultrasonic sensing, that are used in conjunction with vision. Weakness at the algorithm level has become almost the biggest stumbling block hindering the development of domestic inside-out tracking systems.
So, is there a domestic company that has explored inside-out visual tracking algorithms in greater depth?
This time, VR Tuoluo Media tried the product in person and interviewed Christopher, CEO of Camsense, to understand the origin and background of the product.
Positioning effect comparable to Quest? Camsense XR achieves monocular vision 6DoF tracking
Although many domestic companies are engaged in the research and development of VR headsets, very few focus on 6DoF controller solutions. Controller tracking solutions can be divided by technology into visible light, infrared, laser, and electromagnetic approaches. Each has its own advantages and disadvantages. For example, visible light places requirements on the lighting environment and cannot capture anything outside the camera's range; laser solutions need to avoid reflective mirrors; and electromagnetic solutions are easily disturbed by metal.
Unlike all of the positioning solutions above, Camsense XR adopts a visual tracking plus inertial sensing solution, and is one of the very few tracking solutions that support monocular 6DoF. It is mainly used for positioning and tracking the controllers of VR headsets. It combines the head end and the hand end to achieve precise positioning, and can be adapted to different headsets.
The whole solution consists of a visual positioning module equipped with two cameras, plus two 6DoF controllers. Since the core of the solution is the positioning performance delivered by the software algorithm, the industrial design of the controller can be customized for each headset manufacturer, and the camera positioning module can be fully integrated into the product when the headset is designed.
In terms of specifications, Camsense XR can achieve 6DoF tracking at a distance of 1 meter, with a field of view of 170° horizontally and 98° vertically. The positioning error is less than 3 mm under dynamic conditions and about 1 mm under static conditions, and the operating latency is less than 10 ms.
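To make these figures concrete, the sketch below checks whether a point falls inside the quoted tracking volume. It is an illustration only, not Camsense's code. It assumes that the 1 meter figure is a maximum distance from the camera, that the two angles describe a field of view symmetric about the optical axis, and that coordinates are in meters in a camera frame with z pointing forward. The function name and constants are hypothetical.

```python
import math

H_FOV_DEG = 170.0  # horizontal field of view quoted in the article
V_FOV_DEG = 98.0   # vertical field of view quoted in the article
MAX_RANGE_M = 1.0  # assumption: the quoted 1 m is treated as a maximum range


def in_tracking_volume(x: float, y: float, z: float) -> bool:
    """Return True if a point (meters, camera frame, z forward) is trackable."""
    if z <= 0.0:
        return False  # the point is behind the camera
    distance = math.sqrt(x * x + y * y + z * z)
    if distance > MAX_RANGE_M:
        return False  # too far away under the range assumption
    # Angles between the optical axis and the point, per axis.
    h_angle = math.degrees(math.atan2(abs(x), z))
    v_angle = math.degrees(math.atan2(abs(y), z))
    return h_angle <= H_FOV_DEG / 2 and v_angle <= V_FOV_DEG / 2


if __name__ == "__main__":
    print(in_tracking_volume(0.0, 0.0, 0.5))  # straight ahead: True
    print(in_tracking_volume(0.0, 0.6, 0.5))  # about 50° above the axis: False
```

The wide horizontal angle means a controller held far out to the side is still visible, while raising it well above the headset leaves the narrower vertical window sooner.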
Among mainstream VR headsets on the market, inside-out has become the dominant tracking solution, and most of them use visual positioning. The disadvantage of purely visual positioning is that it places high demands on the environment, and tracking is easily lost when the controller moves out of the capture range, so an algorithm is needed to compensate.
The advantage of Camsense XR is that it combines visual tracking with inertial sensing. When the controller is outside the camera's tracking range, inertial sensing compensates. Combined with the algorithm, this yields stable and smooth controller tracking. Visual tracking technology, like inertial sensors, has been widely used in other fields and is mature and stable.
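The general idea of falling back on inertial data when vision drops out can be sketched as follows. This is a deliberately simplified illustration, not Camsense's algorithm: the class and field names are hypothetical, orientation is ignored, and the accelerometer reading is assumed to be already gravity-compensated and expressed in the world frame. Real systems typically use a filter rather than this plain switch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ControllerTracker:
    """Toy position tracker: vision when available, dead reckoning otherwise."""
    position: Vec3 = (0.0, 0.0, 0.0)
    velocity: Vec3 = (0.0, 0.0, 0.0)
    had_vision_last_step: bool = False

    def update(self, dt: float, accel: Vec3,
               vision_position: Optional[Vec3] = None) -> Vec3:
        if vision_position is not None:
            # Camera sees the controller: trust the visual fix and, if the
            # previous step was also visual, refresh the velocity estimate.
            if self.had_vision_last_step:
                self.velocity = tuple(
                    (n - p) / dt for n, p in zip(vision_position, self.position)
                )
            self.position = tuple(vision_position)
            self.had_vision_last_step = True
        else:
            # Controller is out of view: integrate acceleration (dead reckoning).
            self.velocity = tuple(v + a * dt for v, a in zip(self.velocity, accel))
            self.position = tuple(p + v * dt for p, v in zip(self.position, self.velocity))
            self.had_vision_last_step = False
        return self.position


if __name__ == "__main__":
    tracker = ControllerTracker()
    tracker.update(0.1, (0.0, 0.0, 0.0), vision_position=(0.0, 0.0, 0.0))
    tracker.update(0.1, (0.0, 0.0, 0.0), vision_position=(0.1, 0.0, 0.0))
    # Vision lost: the estimate keeps moving at the last known velocity.
    print(tracker.update(0.1, (0.0, 0.0, 0.0)))
```

Because integration errors grow quickly, inertial-only estimates are only useful for short gaps, which is why prompt re-acquisition by the camera matters.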
VR Tuoluo tested the product's performance on site and recorded the results in real time.
When the controller is moved quickly within the camera's tracking range, there is hardly any perceptible delay.
The core of the VR experience lies in immersion and more instinctive interaction, so some experiences involve large hand movements, such as drawing a bow, throwing grenades, dancing, and painting. In these scenarios, the hands often move out of the camera's capture range. If the controller drifts, is lost, or responds with high latency, the experience suffers greatly, and in games the score can be directly affected. In the hands-on test of Camsense XR, when the arm was swung widely, the controller reappeared the moment it returned to the field of view. The relocalization algorithm can re-initialize and locate the controller's pose within 1 ms, which is very prompt.
In addition to the two tests above, VR Tuoluo also tested another feature of the kit on site: monocular 6DoF tracking. Generally speaking, as with human eyes, spatial positioning requires two cameras to perceive depth. Can accurate positioning and tracking be achieved with only one camera? The answer is yes.
A video provided by Camsense shows that when one camera is blocked, tracking accuracy, latency and response speed are not affected at all.
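The article does not explain how Camsense recovers depth from a single camera. As general background, one textbook way a single camera can estimate distance is to use markers whose physical spacing is known: under a pinhole camera model, the farther away the object, the closer together the markers appear in the image. The sketch below shows only this relation. The function name and parameters are hypothetical, and it assumes the two markers lie roughly parallel to the image plane. Full 6DoF pose estimation from many markers is usually posed as a perspective-n-point problem instead.

```python
def depth_from_marker_pair(focal_length_px: float,
                           marker_separation_m: float,
                           pixel_separation: float) -> float:
    """Estimate distance (meters) to two markers of known spacing.

    Pinhole model: pixel_separation = focal_length_px * marker_separation_m / depth,
    so depth = focal_length_px * marker_separation_m / pixel_separation.
    """
    if pixel_separation <= 0.0:
        raise ValueError("markers must be distinct and visible in the image")
    return focal_length_px * marker_separation_m / pixel_separation


if __name__ == "__main__":
    # Illustrative numbers only: 500 px focal length, markers 10 cm apart,
    # appearing 50 px apart in the image.
    print(depth_from_marker_pair(500.0, 0.10, 50.0))
```

Known geometry on the tracked object substitutes for the second viewpoint that a stereo pair would otherwise provide.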
After years of deep cultivation, Camsense doggedly tackles performance and cost
That such accurate positioning and tracking can be achieved owes much not only to Camsense's own accumulated expertise, but also to the product definition and design details.
Camsense was established in 2014 and focuses on the research and development of high-precision positioning sensors. One of the founders, Christopher, holds bachelor's and master's degrees from Tsinghua University. He has more than ten years of research experience in computer vision spatial positioning and human-computer interaction, and holds more than 20 international and national patents.
Although Camsense is a relatively young company, it has applied for 32 invention patents so far, 4 of which have been granted, including 1 that has just been granted by the US Patent Office. Camsense has core algorithms and chips with fully independent intellectual property rights. With this team background and confidence, Camsense has built on its core algorithm, monocular vision positioning, to form several business lines covering robotics, industrial/medical applications, and VR/AR. In robotics, its monocular vision lidar based on ASIC chips has been adopted by many robot vacuum brands. In September, monthly shipments exceeded 100,000 units, and the company has gone from obscurity to become the industry's leading supplier. Another product, Camsense M Pro, is used in industrial, medical and other fields. For example, in UAV testing, a positioning system consisting of multiple monocular cameras is used to measure the spatial trajectory and stability of the UAV in flight.
Asked why Camsense pays attention to VR, Christopher said that he is very optimistic about the development trend of VR/AR. In fact, the company began following the VR industry around 2015 and has been building technical reserves ever since.
"A basic premise we adhere to is that we are optimistic about the VR industry in the long term and believe this industry will grow substantially. This can be seen from the moves that manufacturers such as Oculus and Microsoft have made in the VR industry in recent years. If it were not a sunrise industry, these companies would not invest in it."
After settling on VR as a direction, Camsense focused on controller positioning. Christopher mentioned that among global VR hardware companies, only a handful have monocular 6DoF visual tracking technology. Given the huge market space expected in the future, he sees more opportunities on this track.
On the design of the solution, Christopher also mentioned many details, such as choosing an active light source instead of a passive one. "The benefit of active light sources is that they are not affected by how bright or dark the ambient light is, nor by the texture complexity of the surrounding environment. Even in a pure white environment, tracking is possible."
In terms of appearance design, the common annular appearance is adopted, and the overall volume is much smaller than that of similar products. The active light-emitting points on the handle are distributed around the ring, which is convenient for accurate tracking.
In the controller's specifications, VR Tuoluo noticed that the sensor's frame rate is 30 frames per second. When 60, 70, and 90 frames per second are mainstream, why reduce it to 30?
"We try to lower the hardware requirements and improve the system's algorithmic capability without sacrificing the user experience. In fact, this is very difficult for us," Christopher admitted. Camsense's tracking solution is similar to those of Oculus and Microsoft: all involve head and hand positioning, and all use monocular sensors and active-light controllers. The biggest difference is the hardware on which the algorithm runs. Oculus and Microsoft can afford to use high-cost technology and raise hardware and configuration requirements, which makes the algorithmic processing much easier, and users get the good positioning delivered by Microsoft's 90 fps frame rate or by the four monocular cameras on Oculus products. The challenge facing Camsense is how to balance hardware cost against an excellent user experience, and how to respond to customers' diverse needs while still delivering consistent tracking results.
The advantages of Camsense XR are mainly reflected in the following two aspects:
The first is that the positioning algorithm can deliver on low-end hardware the performance that others achieve on high-end hardware. Camsense XR relies mainly on its high-precision positioning algorithm and needs as little as two monocular cameras running at 30 frames per second at VGA resolution to reach the dynamic positioning accuracy that other manufacturers achieve at 60 frames per second. This is the key to Camsense delivering consistent tracking across a variety of customer needs. Because customer needs are diverse, the solution has to adapt to different processing platforms. A PC can handle high-performance computing, but a standalone VR headset built on a mobile phone processor cannot spend too much of its performance on positioning. Achieving high-quality tracking on such limited hardware is therefore a problem that has to be solved.
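The cost difference between 30 and 60 frames per second can be put in rough numbers. The sketch below is simple arithmetic, not a measurement of any product: it assumes VGA means 640 x 480 pixels, counts raw pixels only, and uses the two-camera configuration described in the article. The function names are hypothetical.

```python
VGA_WIDTH, VGA_HEIGHT = 640, 480  # standard VGA resolution


def frame_interval_ms(fps: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(fps: int, cameras: int = 2) -> int:
    """Raw pixels the positioning algorithm must process each second."""
    return VGA_WIDTH * VGA_HEIGHT * fps * cameras


if __name__ == "__main__":
    for fps in (30, 60):
        print(fps, "fps:",
              round(frame_interval_ms(fps), 1), "ms between frames,",
              pixels_per_second(fps), "pixels per second")
```

Halving the frame rate halves the pixel load but doubles the gap between visual updates, so the algorithm and the inertial data have to cover a longer interval without a fresh image.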
Christopher mentioned that there are three kinds of computing resources in the device. The first is the GPU, which is mainly used for parallel computing workloads such as image processing or games. The second is the DSP, which is mainly used to run computer vision algorithms such as head VSLAM, or deep learning algorithms. The third is the ARM CPU, which is used only for lightweight computing, such as WeChat, Alipay and other common mobile apps. Camsense XR uses the third kind, ARM, so its memory usage is only 50 MB and it can run on a 1.95 GHz dual-core CPU.
After all, reducing power consumption only solves the problems of adaptability and cost; user experience remains a challenge. Now that the hardware baseline has been lowered, how can the experience be kept intact? Christopher said that shortfalls in hardware can only be made up for with high-precision algorithms. Camsense XR's algorithm for tracking in complex environments, from the head end through chip processing to terminal interaction, is inherited from its industrial-grade sub-sub-pixel high-precision positioning algorithm. This technology was granted an invention patent by the US Patent Office this year.
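The article does not describe how Camsense's patented algorithm works. As general background on what positioning finer than a pixel means, the sketch below uses a common textbook technique, the intensity-weighted centroid, to locate the center of a bright spot between pixel positions. The function name and sample data are hypothetical and are not taken from Camsense.

```python
from typing import Sequence, Tuple


def subpixel_centroid(patch: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return the (x, y) intensity-weighted center of an image patch.

    x is measured in columns and y in rows; results may be fractional.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for row_idx, row in enumerate(patch):
        for col_idx, value in enumerate(row):
            total += value
            sum_x += col_idx * value
            sum_y += row_idx * value
    if total <= 0.0:
        raise ValueError("patch contains no light to locate")
    return sum_x / total, sum_y / total


if __name__ == "__main__":
    # A light spot straddling two pixels in the middle row.
    spot = [
        [0, 0, 0],
        [0, 2, 2],
        [0, 0, 0],
    ]
    print(subpixel_centroid(spot))  # (1.5, 1.0): halfway between columns 1 and 2
```

Locating a light point to a fraction of a pixel is one way a low-resolution sensor can yield measurements finer than its pixel grid.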
VR Tuoluo noticed that throughout the interview, Christopher kept emphasizing making "meaningful" products. He continually refines the product definition of Camsense XR from the perspectives of consumers, customers, enterprise developers, products and the overall market environment. Even though the company could invest more money and time to achieve additional functions such as head SLAM positioning, doing so would cost Camsense its uniqueness.
With Daydream and Gear VR fading into history, the 3DoF era is over. Positional tracking, the core of 6DoF products, will become more and more important, and this will bring more opportunities for Camsense.