BodyPix: Real-Time Person Segmentation in the Browser with Tensorflow.js (tensorflow.org)
167 points by mrburton on Nov 18, 2019 | 43 comments


This seems like it can be used to lower the bar for (repressive) governments to identify people.

E.g. by making it easier to distinguish parts of the body, identification via them, or gait analysis, becomes easier.

Are there compelling use cases that make this risk worth it?

Or are we just producing tech because we can?


This update is for the 2.0 version.

It's a shame there isn't a Python TF implementation of this (apparently due to a different file structure between the tf.js model and TF SavedModel?), and it doesn't sound like it's a priority for the team.


TF 2.0 in general (SavedModel included) has been a giant clusterfuck so far. As we used to say at Google "the old thing is deprecated, the new thing is not ready yet" - internally Google is pretty much entirely like this. I get that they had to ship it sooner or later, but IMO they shipped 2.0 a bit too soon. Shipping sooner is fine if you fix the bugs quickly, but that's not really happening either - some rather potent bugs remain open for half a year or more.

So at the moment you have a choice of working with a version that's officially EOL (1.15) or using a version that has several critical features broken or missing, and on which TF Model Zoo doesn't work. Oh, and the EOL version is also busted beyond belief. Worse yet, Google does not reward maintenance work, and since this is considered "launched", you can count on the bugs being there for an extended period of time until Google itself moves to 2.0, which I'd wager is unlikely to happen anytime soon. Facepalm.


The best part is the whole Tensorflow-Swift, with the half-baked support for anything that isn't an Apple device.

Meanwhile Julia works just fine on Windows.


Problem with Julia is that approximately nobody wants to learn yet another language just to do pretty much the same things that Python can do with the help of e.g. PyTorch or NumPy. Us programmers often overestimate the level of enthusiasm non-programmers have for learning programming languages. Heck, I'm kind of also running out of enthusiasm in this area myself.


Julia should be a much easier transition for a python programmer than Swift would be. If your argument is that swift can just be called from python so that the python programmers don't have to touch it directly, then I'd say:

1) That's not the most robust solution, especially for deep learning because auto-diff tools don't cross language barriers well

2) If you have an argument for why 1) isn't a big deal, then the next step would be demonstrating why that argument doesn't apply to Julia just as well as (if not better than) Swift. In fact, Julia has a great interoperability story with other languages. One good example is DiffEqPy and DiffEqR, which are Python and R packages respectively for accessing Julia's state of the art DifferentialEquations library.


Still better than learning a language that only works properly on Apple platforms, without any kind of ecosystem for scientific computing.

I don't see Tensorflow-Swift going anywhere.


I agree. I looked into it briefly (since I sorta already know Swift anyway and like the language) but I can't even run this stuff on Linux easily, so it's kind of useless for me so far. This doesn't in any way mean that Julia has a snowball's chance in hell of taking off though. Besides the language, Python had a decade of head start in terms of "batteries included". And people have gotten used to that particular brand of batteries.


Is Julia likely to become as massively popular as Python? Probably not. Is julia likely to become much bigger than it already is? Yes, I'd say so.

Julia's value proposition is mostly geared towards people trying to do things that are just too difficult, awkward or slow in other languages. It's not just about speed, dynamism and friendly syntax. It's also about solving the expression problem and providing unprecedented composability.

So while it doesn't have great appeal to end users, it does appeal to people who make the sort of things end users want so the community is currently swelling with very talented people making state of the art research libraries. Honestly, whether these libraries attract hordes of 'scripters' or not, doesn't really matter to julia's usefulness to the people who use it currently.

But I think that as julia's package ecosystem evolves and matures (We've only been at 1.0 for a year!), it's going to get more and more attractive to end users.


I'd like to add that it took the Julia community itself quite a while to understand what this "value proposition" actually is, especially that it is not just a very nice high-level language geared towards linear algebra that is as fast as C and allows some interaction with the type system!


Yes, I fully agree. You can even see the vestiges of this in julia's own official website: https://julialang.org

I cringe a little every time I see a julia advertisement that centres its premise around "use julia 'cause it's fast!".

I love how fast julia is, but that really isn't the important part.


Julia will add pressure for Python folks to actually take PyPy seriously.

There is a large subcommunity that doesn't enjoy being forced to drop down to C for anything performance.

Julia already counts some banks on their list, that tend to use ML languages instead of Python due to performance.


I feel like a lot of performance could still be extracted from Python libs everyone is using. E.g. NumPy should probably use MKL on Intel out of the box - an easy 2x improvement in performance. On dense linear algebra MKL is pretty impressive. All the stuff that has to happen in Python will be slow as molasses, though, no question there. Parallelization is also currently much harder than it needs to be. I have a bit of code that does elastic deformation on images. It uses a single core for whatever reason (it does call out into numpy and scipy, obviously; this is not done in Python). All my options for parallelizing this suck mightily. I might end up doing exactly what you're opposed to: drop it down to C++ and use pybind, because currently this stuff is single-handedly responsible for quadrupling the duration of training runs, which weren't super fast to begin with.


PyPy can't get you all the way there in many cases. Unlike Julia. The real problem with PyPy is Python's semantics. They preclude many essential optimizations to get those final few drops of performance.


So basically everyone should move to pytorch and ONNX?


ONNX is a turd in its present state as well. Too many unsupported ops and you can't really run ONNX on e.g. a phone or really anywhere. You could theoretically convert it into something that does run where you need it to run, but then you have to deal with the converter's supported ops as well, and design your net around the intersection of the two sets of ops and their parameters. If you're doing anything more complicated than basic classifiers, you're going to run into issues.
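To make the "intersection of the two sets of ops" point concrete, here is a minimal sketch. The two support sets and the helper name `unsupported_ops` are hypothetical and for illustration only; real exporters and converters publish their own, much longer, support lists, and op parameters add further constraints that this ignores.

```python
# Hypothetical support sets, NOT taken from any real exporter or converter.
ONNX_EXPORT_OPS = {"Conv", "Relu", "MaxPool", "Gemm", "Resize"}
MOBILE_CONVERTER_OPS = {"Conv", "Relu", "MaxPool", "Gemm", "Softmax"}


def unsupported_ops(model_ops, *supported_sets):
    """Return the ops in `model_ops` that are missing from at least one support set."""
    if not supported_sets:
        raise ValueError("need at least one set of supported ops")
    # Only ops present in every stage of the toolchain are actually usable.
    usable = set.intersection(*(set(s) for s in supported_sets))
    return sorted(set(model_ops) - usable)


# A hypothetical network that uses an op only one side of the toolchain supports.
model_ops = ["Conv", "Relu", "Resize", "Gemm"]
blocked = unsupported_ops(model_ops, ONNX_EXPORT_OPS, MOBILE_CONVERTER_OPS)
print(blocked)
```

Running a check like this before committing to an architecture is cheaper than finding the missing op at conversion time.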

That said, I think in a few years with enough elbow grease ONNX could become pretty exciting.

As to what people should use for research: just use PyTorch. I switched to it when it first came out, and never looked back, _except_ when I need my stuff to run on e.g. a phone or something, which is something PyTorch won't help you with.


The best thing would be if everybody converged to the same framework.


When I was a lot younger, I used to play softair sometimes with my friends (wargames with BB guns). I seriously researched ways to build automatic turrets.

Rules are simple: shoot at a foe until it raises its hand, that means the person was shot.

Unfortunately (?), the available software at the time was subpar, and I forgot about it because of two problems:

* How to identify a foe? It would need facial/body recognition from a small set of examples (team pictures before starting; better if it shoots everything else).

* How to know when to stop shooting?

These new technologies would make it almost trivial to build such an application, and make it quite reliable. I guess there is a market for this in softair... I'm even tempted to have a go at it, BUT... I don't think I will because I now have to ponder the ethics of building something that can take a gun, identify targets, aim at them accurately, and fire until they are down. I am usually all in on open source software, but that's just the kind of thing that sounds dangerous to share (same as Defense Distributed, TBH).

I hope enough people share the same misgivings, otherwise there is no point in my refraining from building this. (Except perhaps spending my time on projects that are actually useful for humankind).


Since I quit smoking, I noticed how many tv-series and movies are loaded with cigarettes, alcohol and pharmacy pill consumption. Almost every social event good and bad on screen is dealt with by drinking a shot of liquor. Is this my personal paradigm or are we constantly fed with what we should view as our western culture?

I wonder if it is possible to estimate, per actor / producer / studio, how many alcoholic drinks, drugs and cigarettes they consume on average. Would be awesome to see if there is a correlation with project budget. And maybe even more important: is there an increasing trend in the average number of drug consumptions per hour?


comment on the wrong article?


> Person segmentation

Consider the defense applications!



I think they meant disassembling the actual persons. >.>


I believe gp's point is that Dazzle camo should do pretty well to defeat this system.


And the potentially undetectable aimbots for games - Perfect Headshots Everytime (TM).


JavaScript is about my fifth favorite language - way down the scale of languages that I enjoy using.

That said, I think TensorFlow.js is awesome on so many things: examples are easy to understand and the build system makes them easy to run, many possibilities for using trained models in browser based apps, on my Linux GPU deep learning box training models is fast, great documentation, etc.

In some ways I like the TensorFlow.js ecosystem better than the C++/Python ecosystem.

So far, Swift TensorFlow has been a disappointment for me, but every few months I check it out again.


What can run client-side will run client-side.


Can this potentially be used in 2D animation replacing the manual skeletal animating processes?


That's how we did this; https://m.youtube.com/watch?v=7jfZ6zLkWF0 (was running live on-set)


That's so impressive!

What company do you work for? This is super cool.

Can you talk about the rest of your stack? Or other projects of a similar nature?


http://newchromantics.com/ is my own little company, (website is criminally out of date) I did the tech, http://analogstudio.co.uk/ did the art/comp/post work.

The setup was essentially an app (c++ engine on osx running models), ingesting camera feeds, outputting skeletons, running sound->mouth poses, to unity which puppeteered 2D sprites.

On-set we output to a monitor which mixed the real shots with our graphics.

Nothing crazy! But it was a solid setup (except for overheating DSLRs)


That’s so epic! Can you please give a little bit more detail on your process?


That doesn't look like it needs person segmentation.


No, but the post above was about skeletal tracking, not the OP content.


Really cool! TensorFlow.js?


Do you mean 3D animation?

It looks like the interface returns 2D coordinates for the joints, not 3D coordinates; so it can't directly create 3D animation data.

It was also very jumpy in my browser, so it would require smoothing to be usable; or maybe if it wasn't running real-time it would be more accurate?
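For what it's worth, the kind of smoothing I have in mind could be as simple as an exponential moving average over the 2D joint positions. This is only a sketch under my own assumptions: the frame layout (a list of (x, y) tuples per frame, same joint order every frame), the function name `smooth_keypoints` and the `alpha` value are all made up for illustration and are not part of the BodyPix interface.

```python
def smooth_keypoints(frames, alpha=0.5):
    """Exponentially smooth per-joint (x, y) positions across frames.

    frames: list of frames; each frame is a list of (x, y) tuples, one per
            joint, in the same joint order in every frame (assumed layout).
    alpha:  weight of the newest observation, 0 < alpha <= 1 (1 = no smoothing).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    previous = None
    for frame in frames:
        if previous is None:
            # The first frame has no history, so it passes through unchanged.
            current = [(float(x), float(y)) for x, y in frame]
        else:
            # Blend each new position with the previous smoothed position.
            current = [
                (alpha * x + (1 - alpha) * px, alpha * y + (1 - alpha) * py)
                for (x, y), (px, py) in zip(frame, previous)
            ]
        smoothed.append(current)
        previous = current
    return smoothed


# Hypothetical jittery track of a single joint over four frames.
track = [[(100, 50)], [(110, 50)], [(98, 52)], [(108, 50)]]
smoothed = smooth_keypoints(track, alpha=0.5)
```

A lower `alpha` gives steadier joints at the cost of more lag behind the real motion, so it would need tuning per use case.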


Is there a server-side version of this? I've been looking for something similar in Python all day. Actually, it seems a little buggy, judging by this test -> https://superanimo.com/bodypix/


Does this work in Safari for anyone? I've tried on iMac, iPhone and iPad.

Only Firefox/MacOS works for me.


Cool demo but how can it figure out that all 3 guys are wearing briefs?


What's weird about the pelvic area coloration is that as far as I can tell it doesn't line up with any of the body parts they say the segmentation is looking for (https://github.com/tensorflow/tfjs-models/tree/master/body-p...)


Wow that's impressive.

Is this library MIT licensed or something else? I would try using this to cut out people from images/gifs for my animation platform SuperAnimo:

https://www.superanimo.com
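The cut-out step itself should be simple once a mask exists. A rough sketch, assuming the segmentation comes back as a flat per-pixel mask (1 = person, 0 = background) with the same width and height as the image; the function name `cut_out_person` and the tiny test image are hypothetical, and the pixel layout is flat RGBA like a canvas ImageData buffer:

```python
def cut_out_person(rgba, mask, width, height):
    """Return a copy of `rgba` with background pixels made fully transparent.

    rgba: flat list of ints, 4 per pixel (R, G, B, A), row-major.
    mask: flat list of ints, 1 where a pixel belongs to a person, else 0.
    """
    if len(mask) != width * height:
        raise ValueError("mask size does not match width * height")
    if len(rgba) != 4 * width * height:
        raise ValueError("rgba size does not match 4 * width * height")
    out = list(rgba)  # copy so the caller's buffer is left untouched
    for i, is_person in enumerate(mask):
        if not is_person:
            out[4 * i + 3] = 0  # zero the alpha channel of background pixels
    return out


# Hypothetical 2x1 image: a red pixel (person) and a blue pixel (background).
pixels = [255, 0, 0, 255, 0, 0, 255, 255]
mask = [1, 0]
result = cut_out_person(pixels, mask, width=2, height=1)
```

For gifs the same function would run once per frame, with a fresh mask for each one.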


Apache (which is sufficiently permissive), like most open-source Google projects.


Meh - it didn't work that well when I tried it. [1]

Params I used were {architecture: 'ResNet50', outputStride: 32, quantBytes: 2}, with segmentMultiPerson function.

DeepLab handles the same images perfectly. [2] If anyone sees a problem with the params/method I used, let me know.

[1] https://superanimo.com/bodypix/

[2] Deeplab: https://github.com/tensorflow/models/tree/master/research/de...



