10 Dec 21:51

YouTube’s new auto-dubbing feature is now available for knowledge-focused content

by Lauren Forristal

YouTube announced on Tuesday that its auto-dubbing feature, which allows creators to generate translated audio tracks for their videos, is now rolling out to hundreds of thousands more channels. YouTube first introduced its AI-powered auto-dubbing tool at VidCon last year, when it was only being tested with a limited group of creators. This tool could help […]

10 Dec 20:51

Where Health Insurance Comes From in the United States

by Nathan Yau

About half of people have private health insurance through an employer. However, the other half get their insurance from elsewhere or through a combination of sources. This is where everyone gets their coverage from.

10 Dec 20:46

News: China has designed an impressive brain implant to compete with Neuralink

by Nassim Chentouf

The brain-implant race is on, and NEO is joining the competition. A large-scale clinical trial is planned for 2025, and the implantation took only one hour and forty minutes thanks to an innovative system. The NEO brain implant is semi-invasive. On its social media accounts, Shanghai Science & Technology presented its NEO, with remarkable results...

10 Dec 17:26

Solos challenges Meta’s Ray-Bans with $299 ChatGPT smart glasses

by Jess Weatherbed

The Solos AirGo Vision smart glasses in the krypton 1 frame style. — Image: Solos

Solos’ camera-equipped smart glasses have arrived to provide some much-needed competition against Meta’s Ray-Bans. The AirGo Vision is available now starting at $299 — the same price as the Ray-Ban Meta eyewear tech — and features integration with OpenAI’s GPT-4o AI model to identify and answer questions about the people, objects, and text seen by the camera.

That allows the AirGo Vision to do things like translate text into different languages, provide directions to nearby locations or landmarks, and give the wearer more information about what they’re looking at. Solos says the glasses can also be integrated with other AI models like Google Gemini and Anthropic’s Claude, something the company previously teased when it announced the AirGo Vision in June.

Like the Ray-Ban Meta Smart Glasses, the AirGo Vision camera can capture photos on demand. A swappable frame system means that you can wear the glasses with or without the camera — the battery and touch sensors used to control the device are housed in the frame’s USB-C chargeable hinges, providing an audio-only option when paired with the standard, no-camera-included AirGo frames.

“One thing we promised to deliver on was allowing consumers to have control of their experience with AI and smart technology, particularly with privacy options in mind,” Solos co-founder Kenneth Fan said in the announcement. “That’s why we developed frames that can easily be changed to decide when and where a camera may be appropriate without sacrificing any of the fun features.”

*Here’s a frontal view of the Krypton 1 frame style...*

*...compared to the slimmer Krypton 2 design.*

Solos says the Vision comes “with the option to purchase the frame only for $149 or bundle a camera frame with a regular frame for enhanced privacy, priced at $349.” It’s available in seven colors and two frame styles: Krypton 1, which sports a large square design with prominent nose pads, and the slimmer Krypton 2.

10 Dec 16:34

Smart TVs collect viewing data even when used as external screens, according to research

A team from Universidad Carlos III de Madrid (UC3M), in collaboration with University College London (England) and the University of California, Davis (U.S.), has found that smart TVs send viewing data to their servers. This allows brands to generate detailed profiles of consumers' habits and tailor advertisements based on their behavior.

10 Dec 16:30

The creator of ChatGPT’s voice wants to build the tech from ‘Her,’ minus the dystopia

by Maxwell Zeff

Alexis Conneau thinks a lot about the movie “Her.” For the last several years, he’s obsessed over trying to turn the film’s fictional voice technology, Samantha, into a reality. Conneau even uses a picture of Joaquin Phoenix’s character in the movie as his banner on Twitter. With ChatGPT’s Advanced Voice Mode, a project Conneau started […]

10 Dec 16:22

Microsoft’s AI boss and Sam Altman disagree on what it takes to get to AGI

by Wes Davis

*Microsoft AI CEO Mustafa Suleyman at the UK AI Safety Summit in November 2023.* | Photo by Leon Neal / Getty Images

Microsoft AI CEO Mustafa Suleyman disagrees with OpenAI CEO Sam Altman’s recent claim in a Reddit AMA that artificial general intelligence, or AGI, is possible on today’s hardware. While AGI is “plausible,” he tells The Verge’s Nilay Patel in the latest Decoder episode that it could take as long as 10 years to achieve.

With current hardware defined by Nilay as “within one or two generations of what we have now, I would say,” Suleyman replied, explaining why he thinks that’s unlikely:

I don’t think it can be done on [Nvidia] GB200s. I do think it is going to be plausible at some point in the next two to five generations. I don’t want to say I think it’s a high probability that it’s two years away, but I think within the next five to seven years since each generation takes 18 to 24 months now. So, five generations could be up to 10 years away depending on how things go.

“The uncertainty around this is so high,” Suleyman said, “that any categorical declarations just feel sort of ungrounded to me and over the top.”

He’s also drawing a line between AGI and the “singularity”:

It depends on your definition of AGI, right? AGI isn’t the singularity. The singularity is an exponentially recursive self-improving system that very rapidly accelerates far beyond anything that might look like human intelligence.

To me, AGI is a general-purpose learning system that can perform well across all human-level training environments. So, knowledge work, by the way, that includes physical labor. A lot of my skepticism has to do with the progress and the complexity of getting things done in robotics. But yes, I can well imagine that we have a system that can learn — without a great deal of handcrafted prior prompting — to perform well in a very wide range of environments. I think that is not necessarily going to be AGI, nor does that lead to the singularity, but it means that most human knowledge work in the next five to 10 years could likely be performed by one of the AI systems that we develop. And I think the reason why I shy away from the language around singularity or artificial superintelligence is because I think they’re very different things.

The challenge with AGI is that it’s become so dramatized that we sort of end up not focusing on the specific capabilities of what the system can do. And that’s what I care about with respect to building AI companions, getting them to be useful to you as a human, work for you as a human, be on your side, in your corner, and on your team. That’s my motivation and that’s what I have control and influence over to try and create systems that are accountable and useful to humans rather than pursuing the theoretical super intelligence quest.

Last week, during The New York Times DealBook Summit, Altman set out a lower set of goalposts for AGI than the superintelligence-style phenomenon he’s described in the past.

Now, Altman says AGI will arrive “sooner than most people in the world think and it will matter much less.” And when it comes to superintelligence, “a lot of the safety concerns that we and others expressed actually don’t come at the AGI moment. AGI can get built, the world mostly goes on in mostly the same way, things grow faster, but then there is a long continuation from what we call AGI to what we call superintelligence.”

This is a relationship that appears strained only one year after Microsoft helped reseat Altman as OpenAI’s CEO. After confirming that Microsoft is working on its own frontier AI model capable of competing at the “GPT-4, GPT-4o scale,” Suleyman also commented on the tension between Microsoft and OpenAI:

Every partnership has tension. It’s healthy and natural. I mean, they’re a completely different business to us. They operate independently and partnerships evolve over time... partnerships evolve and they have to adapt to what works at the time, so we’ll see how that changes over the next few years.

10 Dec 16:21

Visual Positioning Systems: what they are, best use cases, and how they technically work

by Skarredghost

Today I’m writing a deep dive into Visual Positioning Systems (VPS), which are one of the foundational technologies of the future metaverse. You will discover what a VPS service is, its characteristics, and its use cases, not only in the future but already in the present. As an example of a VPS solution, I will give you some details about Immersal, which is one of the leading companies in this technology. There is a lot to say and I’m sure you will find this article super informative, so let’s go!

[Disclaimer: this is a paid article built in collaboration with Immersal. In this blog, paid articles maintain the same objectivity, passion, and detail as non-paid ones. They are also completely written by me. A company can pay for an article just to be sure that I mention its product and I publish the post within a certain timeframe. That’s why I don’t call them “sponsored” articles, but “paid”: I’m not here to sell you anything, just to inform you, as usual.]

What is VPS?

A VPS system detecting visual features in the surrounding environment (Image by Google)

VPS stands for Visual Positioning System. Slightly modifying Niantic’s definition, we can say that “a VPS is a cloud service that enables applications to localize a user’s device at real-world locations. Usually, this is used to let users interact with persistent AR content“.

If you want a more technical definition, Immersal has a good one for you: “A Visual Positioning System (VPS) utilizes sophisticated computer vision methods to determine a device’s position and orientation within an environment in real time. It works by processing camera images and analyzing the resulting data together with a database of spatial maps. By recognizing visual cues in the data and understanding their relationship to each other, VPS can accurately localize the device and its orientation within the environment“.

Putting it in layman’s terms, a VPS is a service that detects the exact position and rotation of your device (e.g. your phone) in relation to a physical place, so that you can correctly interact with AR content placed there. Let’s take an example: imagine that you want to create an AR experience in the middle of a park in your city so that a big virtual dragon comes out of a certain fountain. You want all users to see the dragon coming out of the middle of the fountain, no matter where they are in the park. So the devices the users are using, either phones or AR glasses, must have a way to know their exact position and orientation relative to the fountain so that they can all put the dragon in exactly the same physical location. The best solution for this is a VPS service.

Why do we need VPS?

Niantic VPS used to show AR elements at a landmark position (Image by Niantic)

I can hear some of you saying “Why do we need VPS when we have other technologies to map where the users are in a place?”. You are right, we have many tracking technologies, and each has its own use case, but VPS is unbeatable at accurately finding the pose of your device relative to a large physical location:

GPS is great for giving users coarse information about their geographical location. GPS together with the sensors on the phone is all we need to orient ourselves on a 2D map like the Google Maps we use every day. The problem with GPS is that the location is coarse: the usual error is 1-5m, which is irrelevant on a 2D map but becomes a problem when, for instance, I want to put some AR information on the window of a shop. An error of 5 meters means that the info could end up on the shop next door instead;
AR libraries like ARKit, ARCore, or even Meta Insight on Quest are fantastic for local tracking. If you are playing an AR experience in a room, they are the way to go. But first of all, they usually do not detect where they are starting (unless some cloud anchor is used): they just start in the user’s room and use some local surfaces as a reference system. Second, they are made for small spaces, and if you move far from the initial position, the tracking starts to drift and the virtual elements move away from their initial positions, detaching from the physical world;
2D Markers… I mean, they feel a bit old. If your experience is tailored to a specific planar image, they are the way to go, but this is not a common scenario outdoors, for instance. Unless you want to put a huge textured blanket in the park, you cannot use markers to show the dragon on the fountain in the above example. Furthermore, users must always keep the marker in frame to see the augmentations, and this is annoying because it forces them to always look down;
3D Markers: better than the above scenario, but they need you to have an accurate 3D mesh reconstruction of the element to augment and then train some ML classifier to detect the object (which may take a lot of time). Augmentations work only if the 3D element to use as a 3D marker is currently visible. They are very useful if your purpose is to augment a specific physical object, but are still pretty cumbersome and sometimes pretty expensive.

This is a video showing Augmented Reality with 2D markers in 2008 using a Nokia N95 phone. When I say that markers are an old technology, I really mean it!

All the above technologies have their specific use cases, but VPS services are the best technology available to guarantee that the device detects its absolute position and orientation in a certain indoor or outdoor location, even a pretty large one. It is the technology to use when you want to augment a specific place for multiple people in a coherent way.

Use cases of VPS

Before digging into the details of how a VPS system works under the hood, let’s evaluate its use cases.

The first one that comes to mind when talking about users knowing their position in space is an indoor navigation system. Imagine being in a big shopping center, looking for a specific shop: personally, when I try to follow the maps scattered around the place, I get lost 100% of the time. It would be great to have an AR system that shows arrows on your phone screen pointing the way from where you are to the shop you want to reach. VPS systems can help build exactly that: since they can localize the position and rotation of every device, they know where the user is and can guide him/her to the destination. Immersal has in fact developed a similar solution for Mall of Tripla, an 85,000 m² shopping mall in Helsinki. But we can think of other situations where indoor navigation may be very helpful, like hospitals or airports.

Immersal indoor navigation system

Another use case is the superimposition of virtual elements on a building for industrial use cases. For instance, a pretty common request for AR applications is being able to see the network of pipes superimposed on the floor or ceiling of buildings, or even outside in the streets, to facilitate the work of maintenance workers. The right technology to do this is again VPS because it can track the pose of your phone across a large area and so it can help in superimposing the pipe system over the physical location. Immersal powered the AR4FM app by Granlund to provide this use case. Caverion AR by FlyAR had a similar function of overlaying BIM data on top of a real building for maintenance use cases.

Indoor navigation system, plus pipes visualization

Talking about more fun things, we can also mention entertainment and marketing. What if every child could see their favorite cartoon character in a specific place in a city? What if you could see augmented reality information overlaid on a stadium while you watch the match, no matter what seat you are in? What if there could be some virtual show happening in the middle of a commercial center to make your shopping experience more amusing? All these experiences need VPS to make sure the virtual elements are attached to the physical location they are augmenting.

MLB app made in collaboration with T-Mobile and Immersal, showing information about the current match superimposed on the actual field in real time

Seeing things more long term, a clear use case of VPS services is the metaverse, the forbidden M-word that now companies like to call “large-scale spatial computing”. The metaverse requires that all our reality becomes augmented and that we all consistently see the same augmentations in the same locations of the physical world. So if I could see an AR popup that informs me of some discount on a shop, everyone else should see it in the exact same location. The same if I saw a huge dragon in a fountain in the park: all the other people should see it in the exact same physical place, doing the exact same things. To make sure that we all can see these virtual elements in a consistent way all around our cities, we need a system that is able to accurately detect the position and rotation of the devices at city-scale. And this is exactly what a VPS service does.

VPSes are already useful now for some specific use cases, but in the long term they are the foundation of our shared mixed-reality future, that is… the metaverse.

How does a VPS work?

If you are a tech guy like me, at this point, you are probably thinking “Ok Tony, I got that VPS can track the pose of my phone everywhere in a location, but how is this possible?”. Let me go a bit deeper into the technical details and explain the whole process that makes a VPS service work.

Feature Detection

As in many modern functionalities based on computer vision, VPS relies on feature detection. According to the definition given by Immersal, a “Feature Point is a distinct, high-contrast visual feature in an image. A corner of a poster on the wall, the grain on a wooden floor or a detail in the facade of a building”. Putting this definition in layman’s terms too, we can say that a feature point is a point in an image that shows a small corner. The more textured an area is, the more feature points it will have, because more texture means more corners in the image.

SIFT features detected in an image. Notice that the image areas that are more textured have more features (Image by OpenCV)

There are various types of feature points and many algorithms to detect them: if you are into computer vision, you are surely familiar with terms like KLT, SIFT, SURF. The reason why it is important to detect these “corner” features is that corners have distinctive characteristics on both the X and the Y axes. Imagine being in front of a wall that is fully white, in a room with no shadows and even lighting. If I show you a video recorded with a phone moving in front of this wall, you just see full white in every frame, so you have no idea how the phone is moving. Now imagine that there are vertical black stripes on the white wall: if the phone moves vertically, you again see the same striped pattern in every frame, so you have no idea at what vertical speed I’m moving. But if I move horizontally, now you can detect the movement because the vertical stripes move in the video. If there are no stripes, but checkers, now you can spot both vertical and horizontal movements, but you still lack info about the absolute position of the phone. But if some of the checkers are blue, others red, others yellow, and they form a specific pattern, now you can detect exactly where the phone is because your brain can identify some specific patterns of the drawing on the wall and match them with the image shown in the video. This is why it is important to have features with strong components on both the X and Y axes: they are easier to identify uniquely and they help spot movement on all axes.

VPS systems work in a similar way: they memorize the unique features in your space and then they localize your device by matching the features that the camera of the device is seeing with the features that the system knows are in that space.
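
To make the wall example concrete, here is a minimal sketch in Python (my choice for illustration; nothing here says which detector Immersal or any other VPS actually uses). It scores a small grayscale patch with the smaller eigenvalue of its gradient structure tensor, the criterion behind Shi-Tomasi style corner detectors. The function name and the three synthetic patches are assumptions made up for this example.

```python
def min_eigenvalue(patch):
    """Smaller eigenvalue of the 2x2 structure tensor of a grayscale patch.

    A large value means the intensity changes along both X and Y (a corner).
    A value near zero means at least one direction carries no information.
    """
    height, width = len(patch), len(patch[0])
    sxx = sxy = syy = 0.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0  # horizontal gradient
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0  # vertical gradient
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    disc = max(trace * trace / 4.0 - det, 0.0) ** 0.5
    return trace / 2.0 - disc


SIZE = 8  # hypothetical patch size, in pixels

# Fully white wall: no gradient anywhere.
white_wall = [[1.0] * SIZE for _ in range(SIZE)]
# Vertical stripes: intensity changes only along X.
stripes = [[float((x // 2) % 2) for x in range(SIZE)] for _ in range(SIZE)]
# Checkers: intensity changes along both X and Y.
checkers = [[float(((x // 2) + (y // 2)) % 2) for x in range(SIZE)]
            for y in range(SIZE)]

for name, patch in (("white wall", white_wall),
                    ("stripes", stripes),
                    ("checkers", checkers)):
    print(name, min_eigenvalue(patch))
```

The white wall and the striped wall both score zero, while the checkered patch scores well above zero because its intensity changes along both axes. This only measures how trackable a patch is; telling one corner from another (the colored checkers in the example) needs a descriptor on top, which this sketch leaves out.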

Mapping

Now that it is clear what a feature is, we can examine the steps through which a VPS service works. The first step a VPS service must go through to work in a specific space is mapping: the system has to memorize the feature points available in the space where tracking should work. To do this, you usually need a companion mobile app: Immersal, for instance, has the Immersal Mapper, which I tried in its offices in Helsinki.

Immersal Mapper app in use: you can see that moving around, it adds yellow dots to the scene. Those dots are the features detected in the scene

Immersal Mapper looks a bit like the camera app of your phone: you have to walk around the place where navigation should happen, and shoot pictures of it from different points of view so that the system can reconstruct the whole place. The app also has an automatic mode, where you just walk around the place as if you were recording a video, and the system automatically shoots a new picture every time it thinks it is a good place to take one. After you have shot enough pictures, you can upload the data (which is a collection of images and the metadata associated with them, like the pose of the phone when each picture was shot) and let the cloud crunch it to reconstruct a point cloud of the place you were in.

Me using a tablet to scan a room using the Immersal Mapper app

The cloud will extract the feature points of every picture and then merge the data of all these feature points to create a reconstruction of the place. I’m not going to describe here how the reconstruction algorithm works to not make you fall asleep out of boredom (the more geeky readers may look for “multiview stereo” online to read more about this, though), but you can imagine that a few things happen:

Only the feature points that are truly reliable are used for the reconstruction: all the feature points that appear in only one picture but disappear in the next ones are probably just the result of noise, so they are discarded;
The remaining “stable” feature points are matched with one another using the overlapping regions of the various images to reconstruct the shape of the whole place. For instance, if in one image the system detects the feature points of a door to the left of those of a desk, and in another image the feature points of a desk are to the left of those of a bookshelf, the system can use the desk overlap between the two images to work out that along the side of the room there is a door, then a desk, then a bookshelf. By reasoning in the same way on all the images, the system gradually reconstructs the whole 3D shape of the space. This operation is similar to the “stitching” done with multiple flat videos that have to be merged into a 360 video.
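
The first of the two steps above, discarding features that show up in only one picture, can be sketched in a few lines. This is a toy illustration, not Immersal's reconstruction code: the input format (pairs of image id and feature id, with features already matched across images) and the `min_views` threshold are assumptions.

```python
from collections import defaultdict


def stable_features(observations, min_views=2):
    """Keep only the features seen in at least `min_views` different images.

    `observations` is a list of (image_id, feature_id) pairs. Feature ids are
    assumed to be already matched across images (hypothetical input format).
    """
    images_per_feature = defaultdict(set)
    for image_id, feature_id in observations:
        images_per_feature[feature_id].add(image_id)
    return {feature for feature, images in images_per_feature.items()
            if len(images) >= min_views}


# Three overlapping pictures of the same wall; "glare" shows up only once.
observations = [
    ("img1", "door"), ("img1", "desk"), ("img1", "glare"),
    ("img2", "door"), ("img2", "desk"), ("img2", "bookshelf"),
    ("img3", "desk"), ("img3", "bookshelf"),
]
print(sorted(stable_features(observations)))
```

Here the one-off "glare" feature is dropped, while the door, the desk, and the bookshelf survive because each of them is seen in at least two pictures.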

Usually, there is a limit on the size of the maps that can be reconstructed with this operation, but the cool thing is that multiple maps can also be stitched together using the features of their overlapping areas. Thanks to this, VPS services can also work in big environments like university campuses or commercial centers. Actually, Immersal already aims at having city-scale mapping, that is having a big map of a whole city where a VPS tracking system may work.

The resulting point cloud of a mapping operation: this one is made of 3 maps merged together (Image by Immersal)

All VPS systems perform mapping in a similar fashion, but not all of them make this operation explicit for the user. For instance, Google’s Geospatial VPS system does not ask the user to map the space, because Google itself has already mapped many cities using the images it acquired for Google Maps. Niantic does the mapping under the hood using Pokémon Go players: players are encouraged to scan a new part of the city to earn a reward inside the game, without being aware that they’re doing a mapping operation for a VPS system. I think that gamifying the mapping operation has been a genius idea by Niantic.

The result of the mapping operation is a point cloud of stable features that reconstructs the whole place. This can be used in the next step, which is localization.

Localization

Once the map is ready, most of the work has been done. You just have to run your application powered by VPS and make it compare the current images seen by the camera with the model of the place reconstructed by the mapping operation.

Reconstructing the pose of the camera relative to a world location from the data of a specific set of points is a well-known computer vision problem (Image by OpenCV)

At every frame, the system grabs an image from the camera of the device, extracts the features from it, and then compares them with the features of the model. Using some trigonometry magic (I could have said “boring stuff”, but “magic” sounds more exciting), it is possible to reconstruct the rotation and position of the camera by matching the pixel positions of the features found in the current frame with the 3D data of the same features recorded in the 3D model of the current place. Once the system has this absolute pose, it knows exactly where the user is in the place, and so it can show augmentations at exact physical positions. This can also be done for every user in the same location, guaranteeing that they all see a consistent augmented reality.
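
The real problem is 3D (matching pixels to 3D points is known as Perspective-n-Point), and a full solver is too long for a blog post. The snippet below is a deliberately simplified 2D stand-in, not what a VPS actually runs: it assumes the device sees a few features in its own local frame, already knows which map feature each one corresponds to, and recovers the rotation and translation that align the two sets in a least-squares sense. All names and numbers are made up for the example.

```python
import math


def estimate_pose_2d(map_points, observed_points):
    """Find (theta, tx, ty) such that map ~= R(theta) * observed + t.

    Both lists hold (x, y) tuples, and point i in one list is assumed to be
    the same physical feature as point i in the other.
    """
    n = len(map_points)
    mcx = sum(p[0] for p in map_points) / n
    mcy = sum(p[1] for p in map_points) / n
    ocx = sum(p[0] for p in observed_points) / n
    ocy = sum(p[1] for p in observed_points) / n
    dot = cross = 0.0
    for (mx, my), (ox, oy) in zip(map_points, observed_points):
        ax, ay = ox - ocx, oy - ocy  # observed point, centered
        bx, by = mx - mcx, my - mcy  # map point, centered
        dot += ax * bx + ay * by     # proportional to cos(theta)
        cross += ax * by - ay * bx   # proportional to sin(theta)
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation moves the rotated observed centroid onto the map centroid.
    tx = mcx - (c * ocx - s * ocy)
    ty = mcy - (s * ocx + c * ocy)
    return theta, tx, ty


def to_map(theta, tx, ty, point):
    """Transform a point from the device frame to the map frame."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)


# Hypothetical ground truth, used only to generate consistent test data.
TRUE_POSE = (math.radians(30.0), 4.0, -2.0)
observed = [(1.0, 0.0), (0.0, 2.0), (-1.5, 1.0), (2.0, 2.5)]
map_features = [to_map(*TRUE_POSE, p) for p in observed]

theta, tx, ty = estimate_pose_2d(map_features, observed)
print(math.degrees(theta), tx, ty)
```

Since the device sits at the origin of its own frame, `(tx, ty)` is its position in the map and `theta` is its heading. With noisy matches the same formula gives the best fit in the least-squares sense, but it has no protection against wrong matches, which real systems have to deal with.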

Localization is what allows this game to be playable by all people in the stadium

When I tried Immersal in its offices, I remember that after scanning the room we were in and having the cloud reconstruct the point cloud of the place, we proceeded to visualize on the tablet the feature point cloud of the room superimposed on the room itself. This was a good way to test the localization: if the tracking was working correctly, we could see the point cloud perfectly superimposed on the physical elements that compose it. And I can say that the system was working very well because the virtual points replicated exactly the shape of the physical room.

Localization preview in the old Immersal app: the red points are the reconstructed point cloud, which, as you can see, fits the physical environment perfectly. This means that the application can precisely align virtual elements with physical spaces

AR tracking

Once localization works, you can superimpose virtual elements on the room you are in, so as to offer augmented reality to the user. But running VPS localization on every frame is very demanding for a mobile device, so the tracking is usually performed with more lightweight standard SLAM technologies (e.g. ARKit, ARCore), and every 1-5 seconds it is corrected with the absolute pose provided by VPS. This gives a good combination of performance and reliability.
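
A toy simulation shows why this combination works. It is a sketch under made-up assumptions: movement is 1D, local tracking drifts by a constant 1 mm per frame, and every VPS fix is perfect. Real drift is neither constant nor one-dimensional, and the function name and numbers are hypothetical.

```python
def simulate_tracking(num_frames, drift_per_frame, vps_every=None):
    """Return the tracking error (in meters) at each frame of a 1D toy run.

    Local tracking adds `drift_per_frame` of error on every frame. If
    `vps_every` is set, an absolute VPS fix resets the error to zero once
    every `vps_every` frames.
    """
    errors = []
    error = 0.0
    for frame in range(num_frames):
        if vps_every is not None and frame % vps_every == 0:
            error = 0.0               # absolute pose from VPS removes the drift
        else:
            error += drift_per_frame  # lightweight local tracking keeps drifting
        errors.append(error)
    return errors


FPS = 30  # assumed camera frame rate
slam_only = simulate_tracking(10 * FPS, drift_per_frame=0.001)
slam_plus_vps = simulate_tracking(10 * FPS, drift_per_frame=0.001,
                                  vps_every=2 * FPS)

print("max error, local tracking only:", max(slam_only))
print("max error, VPS fix every 2 s:  ", max(slam_plus_vps))
```

Without corrections the error keeps growing for the whole run, while with a fix every 2 seconds it never exceeds what can accumulate between two fixes. The heavy localization runs on only 1 frame out of 60 in this setup.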

How do you develop an application using VPS?

If you want to implement VPS in your application, you usually rely on existing VPS services like Immersal, Google Geospatial, or Niantic Lightship. These services already take care of all the heavy lifting for the mapping and reconstruction algorithms, together with all the localization logic.

You usually just have to import the SDK of the platform you have chosen, and then use its scripts to do a couple of things:

Load the map of the place that you have recorded during the mapping operation. Usually, it is either a file that you downloaded from the mapping service, or it is a reference to a map that you have created in your user account of that VPS service;
Place the virtual objects. These services usually show a preview of the place you are going to augment inside the game engine you have chosen, and they let you put the virtual 3D elements wherever you want.

Google Geospatial Unity SDK

With Google Geospatial SDK you can see the 3D map of the city and you can visually put virtual elements where you want them to appear

Immersal, for instance, has a Unity SDK that lets you preview in the editor the point cloud of the place you have mapped, so you can put the virtual elements in the 3D scene in a visual way. Then the scripts of the SDK simply do the magic of performing the localization and tracking every frame, alone or in combination with other services like AR Foundation.

If you want to go more low-level and write custom code that works directly on the map, you can still do it. From the Immersal servers, it is possible to download the following things for every saved map:

The map file with .bytes extension. This is the actual map file used by the SDK for localization.
A sparse point cloud representation of the map as a .ply file.
A dense triangle mesh representation of the map as a .ply file.
A textured triangle mesh representation of the map as a .glb file.

This gives the developer the maximum flexibility to develop the experience that he/she wants.
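
As an example of such custom code, here is a minimal sketch of a reader for the vertices of a .ply point cloud. It makes several assumptions: the file is in the ASCII flavor of PLY (the format also has binary flavors, which this sketch rejects, and I have not checked which one Immersal exports), the vertex element comes first in the file, and its properties include `x`, `y`, and `z`. The sample data is made up.

```python
def read_ascii_ply_vertices(text):
    """Parse the (x, y, z) vertices out of the contents of an ASCII PLY file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "ply":
        raise ValueError("not a PLY file")
    vertex_count = 0
    vertex_props = []
    in_vertex_element = False
    header_end = None
    for index, line in enumerate(lines):
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "format" and parts[1] != "ascii":
            raise ValueError("only ASCII PLY is handled in this sketch")
        if parts[0] == "element":
            in_vertex_element = parts[1] == "vertex"
            if in_vertex_element:
                vertex_count = int(parts[2])
        elif parts[0] == "property" and in_vertex_element:
            vertex_props.append(parts[-1])  # the property name is the last token
        elif parts[0] == "end_header":
            header_end = index
            break
    if header_end is None:
        raise ValueError("missing end_header")
    ix, iy, iz = (vertex_props.index(name) for name in ("x", "y", "z"))
    points = []
    # Assumes the vertex rows come right after the header (vertex element first).
    for line in lines[header_end + 1:header_end + 1 + vertex_count]:
        values = line.split()
        points.append((float(values[ix]), float(values[iy]), float(values[iz])))
    return points


# Made-up sample: three colored points.
SAMPLE_PLY = """ply
format ascii 1.0
element vertex 3
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
0.0 0.0 0.0 255 0 0
1.0 0.5 0.0 0 255 0
0.0 2.0 1.5 0 0 255
"""

points = read_ascii_ply_vertices(SAMPLE_PLY)
print(len(points), points[-1])
```

For a real file you would pass the contents read from disk instead of `SAMPLE_PLY`, and for binary files a dedicated library is the more sensible choice.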

VPS Systems Characteristics

There are many VPS systems out there, and all of them have their own peculiarities. Let’s see some important characteristics to watch out for when you are looking for the system you should use.

Device compatibility

Not all VPS systems are compatible with all devices and before choosing a service, you should check if it works with the hardware you intend to use.

The compatibility of Immersal both for mapping and localization (Image by Immersal)

Compatibility concerns both the mapping and the localization operations. Mapping may be done with different pieces of hardware: I told you about the mobile phone, but actually it can also be carried out with 360 cameras, Matterport scanners, LiDAR scanners, or drones. Immersal is compatible with all of these. It actually is also compatible with custom solutions: it is not even necessary for the client to use the official Immersal Mapper app.

As for localization, compatibility means understanding which devices may run the applications powered by VPS. Immersal here is very strong because it can work on:

Mobile devices that run ARKit, ARCore, or Huawei AR Engine
AR glasses like Magic Leap, HoloLens, XReal, Rokid
Mixed reality headsets like Pico 4E (a Vision Pro version is in the works)
All devices compatible with WebAR, including mini applications inside WeChat

Immersal’s compatibility with so many pieces of hardware is possible because the VPS servers just work with REST APIs, and these are platform-independent. If a new type of glasses is released, it only needs to communicate with the Immersal servers through these REST APIs to become compatible with the system.

On-device vs on-cloud localization

Some VPS systems need a connection to the cloud to work. These systems perform all the heavy lifting in the cloud so that the application on the client can be more lightweight. Notice that I’m not talking about mapping, which almost always needs the cloud; I’m talking about localization. Localization on the device is lag-free and can work even in parts of the world with a bad internet connection, but it puts the local device under heavy load (which also means faster battery drain). Many VPS systems only offer on-cloud localization because it’s easier for the provider to manage (updates to the localization algorithms only need to be deployed on the server) and it allows the client to be more lightweight.

Immersal supports both: when you develop an application with its SDK, you are asked how to retrieve the map of the place that must be navigated. Since industrial clients care a lot about their private data and do not want to put data about their factories on a random server on the Internet, Immersal also offers a local deployment of the VPS services inside the customer’s own cloud space.

The selection of the map to use inside the Immersal SDK (Image by Immersal)

Indoor vs Outdoor

Some services work better indoors, while others perform better outdoors. Some have been optimized for gaming scenarios, that is, for tracking elements that are close to the user, while others are more oriented toward navigation in larger spaces.

Indoor and outdoor tracking pose different challenges. Outdoor scenes are affected more by lighting, so localizing at night in a scene that was mapped during the day can be difficult, because the features may look different under different light conditions. Indoor scenes have more uniform lighting, but they usually contain many challenging surfaces, like transparent glass or mirrors, that confuse tracking algorithms.

Niantic has always promoted the “Real World Metaverse” because it has always been interested in outdoor augmented reality

Map scale

Some systems may work better in small spaces, while others may be oriented towards big areas. I’ve mentioned before the “city-scale” mapping that Immersal aims for and that is obtained by stitching many smaller maps together. Of course, this is also the mission of big players like Google and Apple.

Going city-scale introduces various challenges. The whole map of a city can’t be stored on the host device, and in any case tracking can’t be done by comparing the current features against those of the whole city every time. That’s why the map has to be broken into smaller chunks, which have to be streamed quickly (preferably via 5G) to the tracking device so that users do not perceive any disruption of the service while they move from one chunk to another. Immersal demonstrated that its city-scale approach works by mapping a roughly 1,000,000 m² area of Helsinki city center with 120+ separate maps that were aligned.
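
To give a feel for the numbers and for the chunking idea, here is a small Python sketch. It assumes, purely for illustration, that the maps are equal square tiles on a regular grid (real maps have irregular shapes and overlaps), and the function names are hypothetical rather than part of any SDK.

```python
import math

AREA_M2 = 1_000_000   # mapped area quoted in the text
MAP_COUNT = 120       # "120+" maps, so 120 is a lower bound

avg_area_m2 = AREA_M2 / MAP_COUNT      # average area covered by one map
avg_side_m = math.sqrt(avg_area_m2)    # side length IF each map were a square


def tiles_around(x, y, tile_size, radius=1):
    """Grid indices of the tile containing (x, y) plus its neighbours within `radius` tiles."""
    cx, cy = math.floor(x / tile_size), math.floor(y / tile_size)
    return {
        (cx + dx, cy + dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
    }


def tiles_to_fetch(position, already_loaded, tile_size):
    """Tiles needed around `position` that are not yet in the local cache."""
    return tiles_around(position[0], position[1], tile_size) - already_loaded


# The user starts at (10, 10) m, then walks east into the next 100 m tile.
loaded = tiles_around(10, 10, tile_size=100)
new_tiles = tiles_to_fetch((110, 10), loaded, tile_size=100)
```

Under these assumptions each map covers roughly 8,300 m², about a 91 m square, and crossing a tile border only requires streaming the three tiles on the new edge instead of the whole city.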

The point cloud of the mapped area in Helsinki. It is pretty cool (Image by Immersal)

Openness

Some VPS systems just have their own pre-made maps, while others are open to you supplying your own maps of the places by scanning the environments. Some of them also let you connect to open systems like the Open AR Cloud, which is an open-source 3D map of the world.

Google Geospatial, for instance, has the handicap that you are in the hands of Google: you cannot scan a place yourself, so either Google has mapped a location well or it has not.

Immersal claims to be a fairly open system, a toolbox that customers can use as they want, even mixing their own tools with those of Immersal.

Pricing

VPS solutions have different prices: usually, they are free to start with, but then there is a monthly fee to pay if you want to build more professional applications. Immersal is free to experiment with, but a Pro license costs $99/month and an Enterprise one requires a private negotiation. (I also arranged for you readers to get one free month of Pro subscription if you use the special code SKARREDGHOST at checkout!)

When evaluating which solution fits you, you should also check which one suits your budget.

Available VPS Systems

An image from Apple’s ARKit documentation about Apple’s VPS system (Image by Apple)

If you want to know some names of famous VPS systems to investigate, here are a few:

When I asked Immersal engineers for an honest comparison of their system with the others available, I was told that Google Geospatial is usually very good for outdoor locations with meter-level accuracy, but its performance depends on how well Google has mapped the place where the app should run. For outdoor locations that are not tracked well, for indoor locations, or if you need to customize the map or need centimeter-level accuracy, Immersal should offer better performance.

Niantic Lightship, instead, works well for gaming use cases, and because its map of the world is crowd-generated, it keeps expanding to new locations. However, industrial companies may not be very happy to see their factories mapped and inserted into the public 3D map of a gaming company. So for B2B use cases, Immersal should offer more data safety.

I have not verified these claims with an objective test of my own, so take them with a grain of salt. As usual, my suggestion is to try things yourself: if you need a VPS service, choose the three that on paper best fit your needs, then try them in the field and see what works best in your actual conditions.

Conclusion

VPS systems are foundational for our future, which will be made of a shared persistent mixed reality. The technology that powers them is not easy to develop, but luckily there are already existing SDKs that do the heavy lifting for us. Immersal is one of the companies offering these services and I have been able to verify with my own eyes that it does a pretty good job.

I hope that this article has been able to foster in you some curiosity about VPS and will entice you to use this kind of service for some applications that are useful for you. And if you have any questions, of course, you can ask them in the comments and I will do my best to support you!

(Header image by Immersal)

The post Visual Positioning Systems: what they are, best use cases, and how they technically work appeared first on The Ghost Howls.

10 Dec 14:32

Obesity rates are down. Is that because of weight-loss drugs?

by Joshua Cohen, Undark Magazine

Earlier this fall, the Centers for Disease Control and Prevention reported data showing that adult obesity rates—long trending upwards—had fallen modestly over the past few years, from 41.9 to 40.3 percent. The decline sparked discussion on social media and in major news outlets about whether the US has passed so-called “peak obesity”—and whether the growing use of certain weight-loss drugs might account for the shift.

An opinion piece in the Financial Times suggested that the public health world might look back on the current moment in much the same way that it now reflects on 1963, when cigarette sales hit their high point and then dropped dramatically over the following decades. The article’s author, John Burn-Murdoch, speculated that the dip is “highly likely” to be caused by the use of glucagon-like peptide-1 receptor agonists, or GLP-1s, for weight loss.

It's easy to see why one might make that connection. Although GLP-1s have been used for nearly two decades in the treatment of type 2 diabetes, their use for obesity only took off more recently. In 2014, the Food and Drug Administration approved a GLP-1 agonist named Saxenda specifically for this purpose. Then in the late 2010s, a GLP-1 drug named Ozempic, made from the active ingredient semaglutide, began to be used off-label. The FDA also authorized Wegovy, another semaglutide-based GLP-1 medication, explicitly for weight loss in 2021.

10 Dec 14:31

Where is Bashar al-Assad hiding? The tracking of his plane on social media is sowing doubt

by Bogdan Bodnar

According to Russian news agencies, Syrian autocrat Bashar al-Assad has reportedly arrived in Russia after the fall of his regime. Several planes were tracked in Syrian airspace on suspicious flight paths, which gave rise to numerous theories.

10 Dec 14:31

John Deere Announced Nearly 200 More Layoffs at Its Iowa Plants During the Holidays. Here’s Why

by Bernadette Giacomazzo

John Deere continues downsizing.

10 Dec 14:29

OpenAI has finally released Sora

by Kylie Robison

A screenshot of Sora — *Free users can still browse a feed of AI-generated videos created by the community.* | Screenshot: OpenAI

OpenAI launched Sora, its text-to-video AI model, on Monday as part of its 12-day “ship-mas” product release series, as The Verge previously reported it would. It’s available today on Sora.com for ChatGPT subscribers in the US and “most other countries,” and it comes with a new model, Sora Turbo. This updated model adds features like generating video from text, animating images, and remixing videos.

With a ChatGPT Plus subscription, OpenAI says you can generate up to 50 priority videos (1,000 credits) at resolutions up to 720p with 5-second durations. The $200 per month ChatGPT Pro subscription that launched last week comes with “unlimited generations” and up to 500 priority videos while bumping the resolution to 1080p and the duration to 20 seconds. The more expensive plan also allows subscribers to download videos without a watermark and perform up to five generations simultaneously.

OpenAI first teased its text-to-video AI model, Sora, in February, and earlier today, Marques Brownlee, aka MKBHD, confirmed the launch with a preview based on his experiences testing Sora so far.

During the livestream, OpenAI showed off Sora’s new explore page with a feed of AI-generated videos created by other community members. The company highlighted a feature called “storyboards” that let you generate videos based on a sequence of prompts, as well as the ability to turn photos into videos. OpenAI also demonstrated a “remix” tool that lets you tweak Sora’s output with a text prompt, along with a way to “blend” two scenes together with AI.

OpenAI says videos generated with Sora will have visible watermarks and C2PA metadata to indicate they’re made with AI. Before uploading an image or video to Sora, OpenAI prompts you to check off an agreement that says what you’re uploading doesn’t contain people under 18, explicit or violent content, or copyrighted material. It says the “misuse of media uploads” could result in an account ban or suspension.

“We obviously have a big target on our back as OpenAI,” Sora product lead Rohan Sahai said during the livestream. “We want to prevent illegal activity of Sora, but we also want to balance that with creative expression. We know that... will be an ongoing challenge, we might not get it perfect on day one. We’re starting a little conservative, and so if our moderation doesn’t quite get it right, just give us that feedback.”

If you don’t have a ChatGPT subscription, you’ll still be able to browse through the feed of AI-generated videos created by other people using Sora. While the model will become available in the US and many other countries today, OpenAI CEO Sam Altman said that it may “be a while” for a launch in “most of Europe and the UK.”

The release of Sora comes just a week after a group of artists, who claimed to be part of the company’s alpha testing program, leaked the product in protest of being used by OpenAI for what they claim was “unpaid R&D and PR.”

Correction, December 9th: The quote previously attributed to Aditya Ramesh was actually said by Rohan Sahai.

10 Dec 14:28

Scientists create AI that 'watches' videos by mimicking the brain

Imagine an artificial intelligence (AI) model that can watch and understand moving images with the subtlety of a human brain. Now, scientists at Scripps Research have made this a reality by creating MovieNet: an innovative AI that processes videos much like how our brains interpret real-life scenes as they unfold over time.

10 Dec 14:27

Crédit Mutuel's Offensive Against FIDA

by Patrice

The prospect of a generalized opening of financial data, as European authorities are concocting it, is still distant, but reactions from the main parties concerned are not long in coming. Is anyone surprised that Crédit Mutuel, a fierce opponent of the earlier PSD2, is leading the critics?

The FIDA regulation being laboriously prepared in Brussels is ultimately just a logical extension of the requirements that have applied since 2019 to payment accounts alone. As the draft stands, it will subject all financial institutions to the same obligations to share, with authorized organizations, the information they hold on all the products their customers own. The Confédération Nationale du Crédit Mutuel, through its chief executive Isabelle Ferrand, therefore considers it an unbearable danger.

Her arguments, unchanged for several years, persist in ignoring the realities of today's "digital" world... and the experience accumulated since the previous text. The talk is still of risks to account security, loss of sovereignty, the creation of inequalities... By contrast, and this is the first gaping hole in the reasoning, nothing is said about the plain fact that should be at the center of the debate: the financial data of service users belongs to them, and the fact that a third party stores it does not make that third party its owner!

Opposition to any opening is in reality a selfish reflex of self-defense. What might its deeper motivations be? First there is the cost of implementation, inevitably high given the prehistoric information systems that prevail in the sector. Then, more insidiously, there may also be a worry about the consequences: creative companies could seize the opportunity to develop the innovative features that customers expect and that their usual bank proves unable to provide.

Even if Crédit Mutuel does not like it, that would be a victory for the promoters of the legislation, one of whose major objectives remains the stimulation of competition. Moreover, it could help preserve European (and possibly French) sovereignty because, on a level playing field, local players will have as much chance, if not more, of designing and deploying offerings that match the needs they are close to. Today, by contrast, the American giants are in a position to profit from the inertia of the traditional financial industry.

The other justifications brandished by Ms. Ferrand have no more substance. On security, for example, five years of PSD2 have shown that the safeguards put in place work properly. But this is of course a (hackneyed) scarecrow meant to frighten those who will be called on to approve the European Commission's proposal without always taking the time to see through the media uproar, which it is therefore important for its opponents to trigger as early as possible.

10 Dec 14:27

Video: In Europe, new highway tech and robots could soon fix roads and protect lives

Europe's road network is its economic backbone. Mostly constructed after World War II, it is nearing the end of its life, so extensive maintenance is essential. Increasing traffic volumes and more frequent road works result in traffic jams, delayed goods transport and risks for road workers. All this puts huge pressure on governments and road authorities.

10 Dec 14:25

Meta’s new Quest update has faster hand tracking and at-a-glance PC connections

by Jay Peters

A photo of the Quest 3 and its controllers. — Photo: David Pierce / The Verge

Meta has announced the v72 Quest update, and it’s packed with features like faster hand tracking, an easier way to pair your headset with a Windows 11 PC, and better support for showing your keyboard while you’re in full virtual reality. The update is rolling out gradually, as are certain features, so you may not be able to use them immediately.

Meta says you can now connect to a paired PC with the Quest’s Remote Desktop feature simply by looking at it and tapping the “Connect” button that appears above your keyboard. That’s similar to how it works on the Vision Pro, but here, you’ll need the Mixed Reality Link app installed on your computer before you can pair the devices together from within your Quest headset’s Settings app. The feature requires Windows 11 22H2 and newer.

A screenshot showing the new PC-connecting feature in action. — *Now you can connect to your PC just by looking at it.*

Also, in Quest v72, the company says it’s “rolling out a more general keyboard tracking system” that should detect and let any keyboard around you appear through a passthrough “window” while you’re in a virtual environment, similar to the Vision Pro. Quest headsets have had a feature that shows a virtual version of your keyboard where your real one is since 2021, but that has only ever worked with specific keyboards.

Meta also says it has made the hand cursor more stable when navigating, pinching to select things, and pinching and dragging windows. The company also says it’s now easier to use your hands while in confined spaces and that it added a “hand ray visualization” to help find and target things with the cursor.

There is a little bit more in the update, too, including new live captions for calls from the People app and the addition of direct messaging in the Instagram app. Meta also added a Media Gallery app for viewing your images, videos (spatial included), and screenshots.

10 Dec 14:25

Google reveals quantum computing chip with ‘breakthrough’ achievements

by Emma Roth

An image showing Google’s quantum computing chip — Image: Google

Google’s quantum computing lab just achieved a major milestone. On Monday, the company revealed that its new quantum computing chip, Willow, is capable of performing a computing challenge in less than five minutes — a process Google says would take one of the world’s fastest supercomputers 10 septillion years, or longer than the age of the universe.

That’s a big jump from 2019 when Google announced its quantum processor could complete a mathematical equation in three minutes, as opposed to 10,000 years on a supercomputer. IBM disputed the claim at the time.

Along with more powerful performance, researchers also found a way to reduce errors, something Google calls “one of the greatest challenges in quantum computing.” Instead of bits, which represent either 1 or 0, quantum computing uses qubits, a unit that can exist in multiple states at the same time, such as 1, 0, and anything in between.

As noted by Google, qubits are prone to errors because they “have a tendency to rapidly exchange information with their environment.” However, Google’s researchers discovered a way to reduce errors by introducing more qubits to a system and were able to correct them in real time. Their findings were published in Nature.

“This historic accomplishment is known in the field as ‘below threshold’ — being able to drive errors down while scaling up the number of qubits,” Google Quantum AI founder Hartmut Neven writes on Google’s blog. “You must demonstrate being below threshold to show real progress on error correction, and this has been an outstanding challenge since quantum error correction was introduced by Peter Shor in 1995.”
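
To make “below threshold” more concrete: in the standard surface-code picture, each step up in code distance (which needs more physical qubits) divides the logical error rate by a factor usually written Λ, and being below threshold means Λ is greater than 1. The Python sketch below uses made-up starting values purely for illustration; they are assumptions, not Willow's measured figures.

```python
def physical_qubits(distance):
    """Qubits in a rotated surface-code patch: d*d data qubits plus d*d - 1 measurement qubits."""
    return 2 * distance * distance - 1


def logical_error(distance, error_at_d3, lam):
    """Logical error per cycle, assuming each +2 in distance divides the error by `lam`."""
    steps = (distance - 3) // 2
    return error_at_d3 / (lam ** steps)


ERROR_AT_D3 = 3e-3  # hypothetical logical error rate at distance 3

# lam > 1 models a device below threshold; lam < 1 models one above it.
for lam in (2.0, 0.8):
    for d in (3, 5, 7):
        print(
            f"lam={lam} distance={d} qubits={physical_qubits(d)} "
            f"error={logical_error(d, ERROR_AT_D3, lam):.2e}"
        )
```

With Λ above 1, adding qubits makes the encoded qubit exponentially more reliable; with Λ below 1, the same extra qubits only add more places for errors to occur, which is why crossing the threshold matters.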

Introducing Willow, our new state-of-the-art quantum computing chip with a breakthrough that can reduce errors exponentially as we scale up using more qubits, cracking a 30-year challenge in the field. In benchmark tests, Willow solved a standard computation in <5 mins that would…
— Sundar Pichai (@sundarpichai) December 9, 2024

Willow, which has 105 qubits, “now has best-in-class performance,” according to Neven. Microsoft, Amazon, and IBM are working on quantum computing systems of their own.

Google’s next goal is to perform a first “useful, beyond-classical” computation that is both “relevant to a real-world application” and one that typical computers can’t achieve. Going forward, Neven says quantum technology will be “indispensable” for collecting AI training data, eventually helping to “discover new medicines, designing more efficient batteries for electric cars, and accelerating progress in fusion and new energy alternatives.”

10 Dec 14:04

Google unveils GenCast: a revolutionary AI model for weather forecasting

by Benjamin

With GenCast, Google is reinventing weather forecasting by using artificial intelligence for more accurate results.

09 Dec 14:10

Robot Rodents: How AI Learned to Squeak and Play

by Heidi Ulrich

Render of life-size robot rat animatronic on blue plane

In an astonishing blend of robotics and nature, SMEO, a robot rat designed by researchers in China and Germany, is fooling real rats into treating it like one of their own.

What sets SMEO apart is its rat-like adaptability. Equipped with a flexible spine, realistic forelimbs, and AI-driven behavior patterns, it doesn’t just mimic a rat — it learns and evolves through interaction. Researchers used video data to train SMEO to “think” like a rat, convincing its living counterparts to play, cower, or even engage in social nuzzling. This degree of mimicry could make SMEO a valuable tool for studying animal behavior ethically, minimizing stress on live animals by replacing some real-world interactions.

For builders and robotics enthusiasts, SMEO is a reminder that robotics can push boundaries while fostering a more compassionate future. Many have reservations about keeping intelligent creatures in confined cages or using them in experiments, so imagine applying this tech to non-invasive studies or even wildlife conservation. In a world where robotic dogs, bees, and even schools of fish have come to life, this animatronic rat sounds like an addition worth further exploring. SMEO’s development could, ironically, pave the way for reducing reliance on animal testing.

09 Dec 14:06

Furless Furby

by staff

Meet the Furless Furby, your childhood friend stripped bare – literally. With 100% less fur and 1000% more nightmare fuel, it’s the perfect blend of nostalgia and chaos. Gift it, meme it, or just let its folds haunt your decor.

Check it out

$19.98

09 Dec 08:25

New Tullomer Filament Claims to Beat PEEK

by Maya Posch

Recently a company called Z-Polymers introduced its new Tullomer FDM filament that comes with a lofty bullet list of purported properties that should give materials like steel, aluminium, and various polymers a run for their money. Even better is that it is compatible with far lower specification FDM printers than e.g. PEEK. Intrigued, the folks over at All3DP figured that they should get some hands-on information on this filament and what it’s like to print with on one of the officially sanctioned Bambu Lab printers: these being the X1C & X1CE with manufacturer-provided profiles.

The world of engineering-grade FDM filaments has existed for decades, with for example PEEK (polyether ether ketone) having been around since the early 1980s, but these require much higher temperatures for the extruder (360+℃) and chamber (~90℃) than Tullomer, which is much closer (300℃, 50℃) to a typical high-performance filament like ABS, while also omitting the typical post-process annealing of PEEK. This assumes that Tullomer can match those claimed specifications, of course.

One of the current users of Tullomer is Erdos Miller, an engineering firm with a focus on the gas and oil industry. They’re using it for printing parts (calibration tooling) that used to be printed in filaments like carbon fiber-reinforced nylon (CF-PA) or PEEK, but they’re now looking at using Tullomer for replacing CF-PA and machined PEEK parts elsewhere too.

It’s still early days for this new polymer, of course, and we don’t have a lot of information beyond the rather sparse datasheet, but if you already have a capable printer, a single 1 kg spool of Tullomer is a mere $500, which is often much less or about the same as PEEK spools, without the requirement for a rather beefy industrial-strength FDM printer.

08 Dec 00:03

US Military Alarmed by Russian Nuclear Weapon Platform in Orbit

by Noor Al-Sibai

A Russian spacecraft launched higher than most satellites has long had the Pentagon worried — and it apparently has a space nuke on it.

Dumb War

A Russian spacecraft launched higher than most satellites has long had the Pentagon worried — and new revelations about what it contains have made those concerns all the greater.

Launched in February 2022 just a few weeks before Ukraine was invaded, Russia's Cosmos 2553 spacecraft is nominally built to test out "newly developed onboard instruments and systems." According to new reporting from the New York Times, however, the mysterious satellite system contains a "dummy warhead" — a precursor of what could come should the Russians decide to arm the craft for real.

As scary as the concept of a space nuke sounds, it wouldn't necessarily harm life on Earth — unless you consider eliminating all satellites in its vicinity harm, in which case the people down on the planet below would be seriously screwed.

ASAT Stats

Back in 1962, the US military actually did detonate a nuclear weapon in space, though the damage from the electromagnetic pulse it emanated seems mostly to have been limited to streetlights dimming in Hawaii, which was below the test.

Scientists learned from that formerly-classified test that doing so was probably a pretty bad idea, and in 1967, both Russia and the United States signed the Outer Space Treaty to prevent, essentially, space warfare. In the years since, however, concerns have grown that Russia may violate the treaty — especially as more and more communications satellites began littering our planet's orbit.

After Russia released Cosmos 2553 some 250 miles above the planet's surface, military experts became concerned that it might be a secret nuclear weapon. As the NYT's new reporting reveals, the US Space Force and a group of intelligence agencies have quietly been looking into the satellite to try to figure out its real purpose.

Throughout 2024, more and more information about the alleged anti-satellite weapon began to trickle out of Washington. In response, Russian President Vladimir Putin has repeatedly denied that it's any such thing — though notably, it doesn't appear he's made any such denial since the NYT reported that Cosmos 2553 contains a dummy warhead.

Despite those denials, Russia vetoed in April a United Nations resolution that would bar nuclear weapons in space. If the NYT's reporting holds up, we may know why.

More on Russian crafts: Insane Video Shows Reckless Russian Fighter Jet Rip Right Past an F-16

The post US Military Alarmed by Russian Nuclear Weapon Platform in Orbit appeared first on Futurism.

06 Dec 16:16

Researchers put bird legs on a drone so it can take off by jumping

by Andrew Liszewski

EPFL’s RAVEN drone shown posed in flight with its bird-inspired legs dangling. — *EPFL’s RAVEN drone trades traditional landing gear for a pair of legs that function similar to a bird’s.* | Image: Alain Herzog

Researchers from the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland and UC Irvine have developed a drone capable of landing and taking off in areas that would otherwise leave a fixed-wing aircraft stranded. Their Robotic Avian-inspired Vehicle for multiple ENvironments (RAVEN) trades traditional landing gear for a pair of bird-inspired articulated legs that allow the drone to walk around, hop over obstacles, and even leap into the air to take flight without the need for a runway.

Quadcopter drones may offer more flexibility when it comes to where they can take off and land, but most rely on four motors which are less energy-efficient than fixed-wing drones that use a single motor paired with gliding for flight. To expand the capabilities of fixed-wing drones, the researchers took inspiration from birds like crows and ravens which can easily maneuver on the ground using a scrawny pair of legs, as detailed in a paper published in Nature this week.

A close-up look at the EPFL’s RAVEN drone’s legs and feet. — *RAVEN’s legs and feet use a simplified design but still offer enough articulation for the drone to maneuver on the ground.*

Recreating the strength and capabilities of a bird’s legs mechanically without adding significant weight to a drone and reducing its operating range required a mix of “mathematical models, computer simulations, and experimental iterations.”

The final design for the legs uses a combination of springs and motors to mimic “powerful avian tendons and muscles” while its simplified feet use “two articulated structures” plus toes with a passive elastic joint. The toes not only prevent RAVEN from constantly face planting, they’re also critical for walking and positioning the drone at the right angle of attack for an effective takeoff.

Fixed-wing drones that take advantage of legs for short takeoffs and landings aren’t an entirely new idea. In 2019, a South African startup called Passerine demonstrated a drone called Sparrow that used a pair of spring-loaded legs to leap into the air and take flight from a standstill. What sets RAVEN apart is the complexity of its legs that allow the drone to walk across rough terrain, jump over gaps, and hop onto obstacles as high as 10 inches — in addition to being able to leap into flight.

RAVEN’s operations aren’t limited to airports or areas with smooth surfaces, which traditional wheeled landing gear requires. It also doesn’t require human intervention to get airborne again. It’s capable of landing and exploring areas that may be dangerous or restricted to humans, and then repositioning itself to an area that’s safe for takeoff. And it does it all using less power than a quadcopter drone would, giving it a larger operational range.

06 Dec 08:57

E-tattoos could make mobile EEGs a reality

by Jennifer Ouellette

A 3D-printable EEG electrode e-tattoo. Credit: University of Texas at Austin.

Epidermal electronics attached to the skin via temporary tattoos (e-tattoos) have been around for more than a decade, but they have their limitations, most notably that they don't function well on curved and/or hairy surfaces. Scientists have now developed special conductive inks that can be printed right onto a person's scalp to measure brain waves, even if they have hair. According to a new paper published in the journal Cell Biomaterials, this could one day enable mobile EEG monitoring outside a clinical setting, among other potential applications.

EEGs are a well-established, non-invasive method for recording the electrical activity of the brain, a crucial diagnostic tool for monitoring such conditions as epilepsy, sleep disorders, and brain injuries. It's also an important tool in many aspects of neuroscience research, including the ongoing development of brain-computer interfaces (BCIs). But there are issues. Subjects must wear uncomfortable caps that aren't designed to handle the variation in people's head shapes, so a clinician must painstakingly map out the electrode positions on a given patient's head, a time-consuming process. And the gel used to apply the electrodes dries out and loses conductivity within a couple of hours, limiting how long one can make recordings.

By contrast, e-tattoos connect to skin without adhesives, are practically unnoticeable, and are typically attached via temporary tattoo, allowing electrical measurements (and other measurements, such as temperature and strain) using ultra-thin polymers with embedded circuit elements. They can measure heartbeats on the chest (ECG), muscle contractions in the leg (EMG), stress levels, and alpha waves through the forehead (EEG), for example.


06 Dec 08:50

OpenAI wants to pair online courses with chatbots

by Kyle Wiggers

If OpenAI has its way, the next online course you take might have a chatbot component. Speaking at a fireside on Monday hosted by Coeus Collective, Siya Raj Purohit, a member of OpenAI’s go-to-market team for education, said that OpenAI might explore ways to let e-learning instructors create custom “GPTs” that tie into online curriculums. […]

06 Dec 08:50

Meta Orion Through the Optics Pictures

by Karl Guttag

Meeting at CES 2025

CES is just a few weeks away (January 7-10, 2025). If you or your company want to schedule a meeting with me at CES 2025, please email meet@kgontech.com.

Introduction

Brad Lynch of the SadlyItsBradley YouTube channel let me know about Fast Company’s article, Inside Meta’s long-term vision to make its Orion glasses the Airpods of augmented reality, which included through-the-waveguide pictures of Meta Orion.

In addition to Brad’s YouTube channel, Brad and I have made several videos together, including our roundtable discussion of Meta Orion and Snap Spectacles (AR Roundtable Video Part 3, AR Roundtable Video Part 2, Snap Spectacles 5, and Meta Orion Roundtable Video Part 1).

Meta Orion Through the Optics Pictures

As I wrote in Meta Orion AR Glasses (Pt. 1 Waveguides), I was skeptical as to the image quality of Orion’s waveguide:

Diffraction gratings have a line spacing based on the wavelengths of light they are meant to diffract. Supporting full color with such a wide FOV in a single waveguide would typically cause issues with image quality, including light fall-off in some colors and contrast losses. Unfortunately, there are no “through the optics” pictures or even subjective evaluations by an independent expert as to the image quality of Orion.
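To make the wavelength dependence concrete, here is a minimal sketch using the standard grating equation for light at normal incidence, n·sin(θ) = m·λ/d. The grating pitch, refractive index, and the three wavelengths are assumed illustration values, not Orion's actual design parameters.

```python
import math


def diffraction_angle_deg(wavelength_nm, pitch_nm, n_medium, order=1):
    """Diffraction angle inside the waveguide medium for normal incidence.

    Uses n * sin(theta) = order * wavelength / pitch.
    Returns None when that order cannot propagate (evanescent).
    """
    s = order * wavelength_nm / (pitch_nm * n_medium)
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


# Hypothetical values chosen only for illustration.
PITCH_NM = 400.0   # assumed grating line spacing
N_MEDIUM = 2.6     # assumed refractive index of the waveguide material

angles = {
    name: diffraction_angle_deg(wl, PITCH_NM, N_MEDIUM)
    for name, wl in (("blue", 460.0), ("green", 530.0), ("red", 630.0))
}

for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

With one fixed pitch, each color leaves the grating at a different angle, so a single waveguide has to carry three differently steered images across the whole FOV. That is one reason color uniformity gets harder as the FOV widens.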

Fast Company has the first and only pictures I have seen with a view through Meta Orion’s waveguides. The pictures, taken by Meta and provided to Fast Company, are not of high quality (they look like they were taken hand-held by a smartphone), but they do show the poor color uniformity provided by Orion’s waveguides. Below are the pictures as published by Fast Company:

For the next set of pictures, I have adjusted the contrast to show the color variation better.

Quoting from the Fast Company article (with my bold emphasis):

Instead, its lenses encase the thinnest film of silicon carbide. Twice as refractive as glass, light doesn’t just bounce off the silicon, but actually flows through micro etched channels in the material to ultimately be viewable only to the wearer. It also means that wearing Orion gives the outside world the faintest iridescent glow.

This writing seems to suggest that there is some amount of colorization of the real-world view as well.

Meta embracing color issues – Hiding it in plain sight

The article describes Meta as trying to design the user interface to hide the color issues in plain sight by designing icons with colors similar to those caused by the waveguide’s variation.

This glow isn’t enough to transform someone’s face (IRL or on a call) into a confetti cake, but it is an aesthetic that the UX team leaned into across the entire interface. Or as Pujals puts it, “We embraced the boundaries.” The app icons themselves use jewel-like color gradients. And its “Aero” UI was inspired by aerogel (the world’s lightest, semi-opaque material) with panels that shimmer with color. Think of Orion as the Miller Extra Light of hyper reality.

There are also remarks (below) that Meta is experimenting with fonts to reduce readability issues caused by the waveguide.

In AR, subtle is trickier than overt. While Orion’s rainbow world is something of a punchdrunk buzz, a year ago, it was more like drunk goggles. Much of Pujals’s UX work has been technical in nature. Simply to make the screen legible, she’s worked alongside hardware and software engineers to render pixels properly in silicon carbide, smoothing out rough edges, eliminating strange aberrations. (The company is also working on a new version of its sans serif typeface, Optimistic, with fewer curves and deeper ink pools to be more legible in the product.)

The article also discusses the design and human factors trade-offs between resolution and FOV.

One of the biggest decisions Meta needs to make before shipping the product is around a technical tradeoff of its own screen: Will they prioritize field of view or resolution?

Meta already has a version of Orion running with twice the resolution of the demo I tested. But much like a projector works in your home, the bigger the image, the fuzzier it gets. And Meta is still mulling just how to tune their technology to consumers, to balance image expansion with clarity. (Technically, they call this measurement “pixels per degree.”)
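The tradeoff in the passage above can be put in rough numbers. The sketch below treats pixels per degree as a simple average (pixels across divided by FOV in degrees) and ignores how angular resolution varies across a real display. The pixel count is a hypothetical value chosen for illustration; the 30-degree and 70-degree figures are the FOVs discussed in this article.

```python
def pixels_per_degree(pixels_across: int, fov_degrees: float) -> float:
    """Average angular resolution when pixels are spread evenly over the FOV."""
    if fov_degrees <= 0:
        raise ValueError("fov_degrees must be positive")
    return pixels_across / fov_degrees


# Hypothetical horizontal pixel count, not a published Orion specification.
PIXELS_ACROSS = 1400

narrow = pixels_per_degree(PIXELS_ACROSS, 30.0)       # same pixels, small image
wide = pixels_per_degree(PIXELS_ACROSS, 70.0)         # same pixels, large image
doubled = pixels_per_degree(2 * PIXELS_ACROSS, 70.0)  # twice the pixels, large image

print(f"30-degree FOV: {narrow:.1f} ppd")
print(f"70-degree FOV: {wide:.1f} ppd")
print(f"70-degree FOV, 2x pixels: {doubled:.1f} ppd")
```

Stretching the same pixels over a wider FOV lowers the pixels per degree, which is the projector effect the article describes. Doubling the pixel count at a fixed FOV doubles it.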

Comparison to Jade Bird Displays Compensation Demo with Diffractive Waveguides

The color variation with Orion appears to be worse, but the brightness variation appears to be less than in my recent study of the Jade Bird Display’s (JBD) compensation demo. The JBD demo only had a 30-degree FOV versus Orion’s reported 70-degree. With diffractive waveguides, it becomes harder to support uniformity with a wider FOV.

Below are high-resolution pictures I took through JBD’s diffractive waveguide correction demo, which was used in Jade Bird Display’s MicroLED Compensation. I took these pictures against a black background to give high contrast.

Conclusion and Memories of Hololens 2

The pictures of the view through Orion’s waveguides are about what I would have expected with a 70-degree FOV single (for all colors) diffractive waveguide; in other words, they’re not very good. This also explains why I was skeptical of the reports from people given access to Orion that didn’t mention the image quality (a problem I have also had with reports on other AR/MR products, including the Apple Vision Pro). Often, the people given access, either due to a lack of understanding or the desire to keep access to the big companies, don’t report on image quality issues.

The images through the Meta Orion optics make one wonder what Meta was trying to prove with the silicon carbide waveguides. In the end, it proved that with a lot of money, they could produce a low-quality image. There seems to be a lot of “not-invented-here” and a desire to do interesting research. I’m all for big companies doing interesting research, but when it comes to making something for demonstration, I think they would be better off using the best available technology.

It reminds me of Microsoft’s Hololens 2 program, which spent huge amounts of money to produce a terrible image with a laser-scanning display (see my series on Hololens 2 image quality problems) with their “butterfly” diffractive waveguide. Below is a comparison of the Hololens 2 with laser scanning and diffractive waveguides to Lumus’s Maximus with reflective waveguides using LCOS microdisplays (from Exclusive: Lumus Maximus 2K x 2K Per Eye, >3000 Nits, 50° FOV with Through-the-Optics Pictures). The image on the right likely cost hundreds of millions more to develop.

06 Dec 08:48

Riot is making a League of Legends card game

by Andrew Webster

A photo of a League of Legends collectible card game. — Image: Riot Games

The League of Legends universe is expanding once again — this time with a physical card game. Riot Games announced today that it’s developing a physical trading card game set in the League universe. The game is currently known as “Project K,” and Riot says it’s working with an unnamed partner in China to release the game there in early 2025. As for a global release, Riot says, “We are taking our time to find the right publishing partners.”

There aren’t a lot of details available about Project K. According to Riot, the game “has unique gameplay and is best when played with friends and in person,” and development is being led by director Dave Guskin and producer Chengran Chai. You can get a sense of the game in the images below:

Of course, this is far from the first spinoff from League. So far, that has included mobile games like Teamfight Tactics and Wild Rift, the Netflix series Arcane, and the competitive fighting game 2XKO, which is expected to launch next year. Not all of these bets have paid off. In January, Riot announced that it was cutting more than 500 jobs, which included shutting down Riot Forge Games, a publishing label for indie games set inside of League. Also impacted was Legends of Runeterra, a mobile card game that launched in 2020, which Riot said “hasn’t performed as well as we need it to.”

The Project K news comes as card games are having another moment, led largely by the new smartphone version of the Pokémon Trading Card Game.

06 Dec 08:18

Making mapping software user-friendly

by Matthew Hempstead

Spotted: Companies across virtually every industry will, at some point, need to make sense of complex location-based data. However, traditional Geographic Information System (GIS) software – which integrates relevant information onto digital maps to help users visualise the data – can be expensive and inaccessible, compromising an organisation’s ability to act on information and make decisions effectively. Now, Atlas.co wants to change that.

The Norwegian startup, which was founded back in 2021 by a group of then-university students, wants to make geospatial analysis simple and accessible with its browser-based platform. Technical GIS tools may be powerful, but they tend to be too cumbersome and complicated for non-experts to install and use. Atlas.co, on the other hand, is user-friendly and requires no specialised hardware or software installation.

The cleanly designed interface features a builder tool where users can easily drag and build spatial maps and interactive dashboards. Users can then upload and layer on the data they want to visualise and style it accordingly. As Co-founder Vegard Løwe explained to Springwise: “We wanted to build a solution that anyone could use to unlock the potential of geospatial data, enabling impactful outcomes like improved infrastructure planning and sustainability efforts.”

As well as being easier to use, Atlas.co was built with collaboration in mind: multiple users can create, share, and update maps together in real time. This kind of approach could transform the way teams and individuals tackle various challenges, including resource optimisation, disaster response, and sustainable planning. It’s also applicable across industries, whether that’s helping retailers better understand their customers in a specific location, speeding up the development of renewable energy projects, or helping to predict potential habitats for endangered species.

The company recently completed a $2 million pre-seed funding round, and Løwe shared that Atlas.co next plans to expand integrations with popular tools and refine its AI-driven features to make mapping even smarter and faster.

Written By: Matilda Cox


05 Dec 20:17

Eat the frog

Eat the Frog is a memorable, if rather gross, metaphor that means choosing the most important thing you must do today and doing that first.

It derives from a variant of a quote that Mark Twain never said, "If it's your job to eat a frog, it's best to do it first thing in the morning. And if it's your job to eat two frogs, it's best to eat the biggest one first."
— Not Mark Twain

Another way to think about it is that if you first Eat the Frog, everything else will be easier for the rest of the day. And won't that feel nice?

The Eat the Frog time management technique was popularised in Brian Tracy's productivity book Eat That Frog! Get More of the Important Things Done - Today!

Many tools in Tracy's book are based on the general premise that there will always be more to do than you can do. So, find the things that will have the most impact and do those first. The others will drift on by.

Why is Eat the Frog helpful?

I once saw someone ask for advice on getting motivated to clean the house. One person answered: "Write a book."

Sometimes, our most significant and impactful goals are those on which we so readily procrastinate. Often, these will be in the important, not urgent, bucket.

The Eat the Frog approach helps if:

You are prone to procrastination 🙋
You have a lot of things that need to be done or that you want to do
You work well in the morning

One of the nice things about eating the frog first thing is that getting it done feels really good. It helps build momentum for the rest of the day with a nice endorphin rush of ticking off something significant.

Decide on your Most Important Task (MIT) to tackle, ideally the night before. If it's too big and you can't finish it in one sitting, divide it into manageable chunks. Then start on the first chunk and leave everything else.

And sometimes, the hardest thing to do is to start. So consider doing just 5 minutes. Once you've started, very often, keeping going becomes easy.

Challenges when Eating the Frog

Questions that come up regularly in Tracy's book "Eat That Frog", including for the Eating the Frog principle, are variations of:

What is the most important thing you could be doing?
What is the highest-value activity?
What activity will have the most impact?
If you weren't already doing this, would you start doing this?
What skills will take you furthest in your career?
What are you able to do best that others can't?

It all looks pretty straightforward when I read questions like these. But I have found that answering them is usually much harder.

Our goals are often interrelated. For example, earning more may require you to increase your skills first. And getting clear on your most important goals is itself a prerequisite, and not always a straightforward one.

Then, which activities are the most important or will have the most impact can be unclear. Sometimes, small things such as an unexpected event or connection have outsized benefits, and neglecting a seemingly minor task could lead to losing a significant client. But does it still make sense to try to work on the most important task first thing in the morning? Yes.

Don't Drag it Out

Like a child leaving a Brussels sprout at the side of the plate to get cold while they eat the roast, it's easy to leave our most impactful and daunting tasks until after the easier stuff is done. But it doesn't usually help to wait.

As the saying goes, if you have to eat a live frog, it doesn't pay to look at it for long.

Related Ideas

Face it, you'll never get caught up, so perhaps think about Oliver Burkeman's Rivers not Buckets
Eisenhower Matrix: important/urgent
Pomodoro Technique
The Power of Streaks
Don't make important decisions on an empty stomach
The doorstep mile
Eat the Frog is a fantastic example of putting the principles of making your idea Sticky into action
Use checkboxes
Present bias
Finishing lines
Poor frogs: the Frog boil metaphor
Another quote "from" Mark Twain appears in Big Ideas Little Pictures: "Good judgment comes from experience, and experience comes from bad judgment."

Notes

The always excellent Quote Investigator traces the Eat the Frog family of quotes to the French writer Nicolas Chamfort, who said, more or less, that eating a toad first thing would help steel you against the rest of the day as nothing worse will happen to you.

05 Dec 20:15

OpenAI may be planning a ChatGPT Pro plan for $200 per month

by Kyle Wiggers

OpenAI’s “12 days of shipmas” event doesn’t kick off for a while, but the first big announcement might’ve been revealed early. A “feature gated” web page on OpenAI’s website refers to a “ChatGPT Pro” plan that includes all the benefits of ChatGPT Plus, as well as new goodies. Recall that ChatGPT Plus is an upgraded […]

Jean-Philippe Encausse

Shared posts
