
Help 3D camera speculations: what new possibilities?

Why is it that you have a camera capable of capturing approx 25-30 frames per second for video but not able to take individual pictures at that rate? Is it just that the write time for new files for each photo is slow?

It's not so much that the write time for a new file is slow as it is the large difference in data size between video and 30 full-resolution pictures per second.

Better phones will record 1080p30 (around 2 megapixels per frame) at around 10-15 Mbps. In comparison, if you wanted to take thirty 8-megapixel pics in a second, you would have a bit rate (assuming 4 MB per pic) of about 960 Mbps.

You may have noticed that the resolution per frame for the pictures is only 4 times larger than for the video, yet the bit rate is far more than 4 times larger. That's because each frame is compressed much more heavily in video than it is for still pictures.

A transfer rate of 960Mbps is just not feasible.
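To make that arithmetic concrete, here is a quick back-of-the-envelope calculation in Python (the 4 MB-per-picture and 15 Mbps figures are the same rough assumptions as above, not measured values):

# Rough data-rate comparison using the assumed numbers from the post above.
video_bitrate_mbps = 15                  # 1080p30 recording at roughly 15 Mbps
pic_size_mb = 4                          # assumed ~4 MB per 8 MP picture
burst_fps = 30                           # hypothetical 30 full-resolution shots per second

burst_bitrate_mbps = pic_size_mb * 8 * burst_fps    # megabytes -> megabits, times shots per second
print(burst_bitrate_mbps)                           # 960 Mbps
print(burst_bitrate_mbps / video_bitrate_mbps)      # about 64x the video data rate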
 

So, in essence, it would be better to have an app that can just extract frames from the video as opposed to trying to capture individual photos? I wonder if someone has done that.

Or, alternatively, maybe an app that takes pics at lower quality but a higher rate of fire, so to speak.

Or am I just talking nonsense here? I really don't use a camera much, but am intrigued at the possibilities surrounding the dual cameras.
 

Exactly why I ask! :D
 

It depends on what your goal is. If your goal is to get the perfect picture out of a series taken in rapid succession, then yes, that will work, but as always there are some disadvantages. Also, this doesn't really have anything to do with 3D; it's all using 2D video.

This website goes over the whole process assuming the use of an Olympus camera. If you ignore all the specific information to the Olympus camera it gives great general info.
Lightning in a Bottle: Extracting Still Photos from your Video Files

Here's a blurb from the site about the disadvantages:
"Disadvantages of grabbing video stills
Grabbing still images from digital video is a nice alternative for photographers wanting to take full advantage of their hybrid still/video cameras. These techniques make it easier to grab the perfect expression during a portrait setting, for example, or capture a fleeting moment in time, but there are limitations to this method. Here are two considerations you should be aware of when extracting still images from video.

- Reduced Image size: The 720p HD video footage captured by the E-P1 and E-P2 provides an image size larger than needed for standard DVD output or Web video, but may not be large enough for a high-quality print. 720p video contains pixel dimensions of 1280x720. This corresponds roughly to a 5 1/3 x 3 inch print at 240 pixels per inch. Using software like Photoshop, you can upsample the image for a larger print size, but the image quality may suffer.

- Compressed Footage: As mentioned earlier, the footage captured by the E-P1 and E-P2, like all other hybrid cameras currently on the market, compresses video footage to save space on your memory card and make your video files easier to edit. As a result of this compression, you have fewer options for post-processing your still extracts than you will with a camera raw still created with the same camera. "
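If anyone wants to try the grab-stills-from-video approach without special software, it is easy to script once the clip is on a computer; here is a minimal Python/OpenCV sketch (the file name and the save-one-frame-per-second interval are just placeholders, nothing specific to the phone):

import cv2  # OpenCV, assumed to be installed

cap = cv2.VideoCapture("clip_1080p.mp4")    # hypothetical clip copied off the phone
frame_index = 0
while True:
    ok, frame = cap.read()                  # decode the next frame
    if not ok:
        break                               # end of the clip
    if frame_index % 30 == 0:               # keep roughly one still per second of 30 fps video
        cv2.imwrite("still_%05d.png" % frame_index, frame)
    frame_index += 1
cap.release()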
 
You can expect added features to be available beyond standard 2D and 3D capture on the EVO 3D. Burst/Sport mode will utilize alternating camera lenses. The other mode will be a panoramic effect, once again using both lenses. You can expect further details on the EVO 3D's dual-lens options on June 3rd.
BSOD
 

What else can you tell us BS!?? We are starving for info here!
 
Here are the possibilities...

You can take 2D pics

You can take 3D pics

You can take 2D video

You can take 3D video

That about sums it up. Don't expect much gee-whiz beyond that...
 
It seems avisynth is down at present.
Sorry about that; it seems to be up now.

You seem to be describing an attempt to take frame rate processing (per your use of the phrase, spatiotemporal analysis, something I've noted here and there) (aka 120/240 Hz processing on better LCD HDTVs) and apply it to blend stereoscopic images as opposed to successive frames. In either case, the processing steps are the same - except - frame rate processing assumes that the motion components are displaced by a maximum of 1/24th second - and that more than 2 successive frames are used to resolve the images in time.
1/24 second can nevertheless entail a huge displacement. Also keep in mind camera jitter. This is what my method takes into account. It is harder when there is a temporal offset in addition to a semi-random spatial one than when there is only a fixed spatial one.

For what you propose, let's assume the two lenses are apart by at least an inch. Even at the lowest frame rate algorithm you're going to use, you're asking for the known algorithms to reconcile something with an apparent motion of 2 ft/sec - or over 1.3 mph, moving laterally.
This is a non-issue. In fact, this would be much easier to reconcile because of the fixed distance. You never know how much motion will occur in that 1/24 second, and different objects move at different speeds that require discrete instances of motion vector fields of adaptive cohesion. Plus, the objects can "move" in entirely unpredictable ways (complex spatial transformations such as object rotation, changes, overlaps, new objects 'appearing', etc.). With two lenses spaced apart, simple optics and a depth-of-field analysis are substituted, which yields significantly greater speed and accuracy than mapping actual motion. Motion has a high entropic component. 120 Hz TVs take advantage of the fact that our eyes lose a great deal of spatial accuracy in high-motion, high-framerate scenes, so they can get away with the simplest trick in the book, luma-weighted motion blur, to fool the eyes into thinking the motion is smoother.
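For reference, that "simplest trick in the book" amounts to little more than a weighted average of neighboring frames; a minimal sketch of that kind of naive interframe blend (NumPy assumed, and the 50/50 weighting is an arbitrary choice for illustration):

import numpy as np

def blend_midframe(frame_a, frame_b, weight=0.5):
    # Naive interframe blend: the fabricated in-between frame is just a weighted
    # average of its two neighbors -- no motion estimation involved at all.
    a = frame_a.astype(np.float32)
    b = frame_b.astype(np.float32)
    return ((1.0 - weight) * a + weight * b).astype(np.uint8)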

Also, you say "laterally," but it is a function of the phone's tilt, because the cameras won't always be on a horizontal axis (the method I outlined takes this into account as well). In any case, if motion directly in front of the lens would have a displacement of "2ft/sec" in the framerate you specify, this is fit into the depth function as are other vector fields of a threshold cohesion, and these planes are sampled to create a depth function around which a convolution or further processing is fairly trivial.

Your contention is that the existing algorithms can deal with that level of spatial uncertainty? Of course! First of all, let's clear up some misconceptions. What makes you think it's uncertainty? Of course there will be some level of uncertainty, but this uncertainty can be minimized to an infinitesimally small amount in most cases. Again, a threshold SAD between vector fields accounts for vector uncertainty by default. There is much less to be "uncertain" of when all displacement is fixed as a calculable function of perspective. There is a much higher degree of certainty when the only component of "change" is a fixed spatial one.
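To illustrate the "calculable function of perspective" point: with a fixed baseline and a known focal length, horizontal disparity maps straight to depth, so the search is constrained to a single axis instead of an arbitrary 2D motion field. A minimal sketch of the standard pinhole-stereo relation (the focal length and baseline below are made-up figures, not EVO 3D specs):

# depth = focal_length * baseline / disparity (classic pinhole stereo geometry)
FOCAL_LENGTH_PX = 1400.0   # hypothetical focal length expressed in pixels
BASELINE_M = 0.025         # roughly one inch of lens separation, in meters

def depth_from_disparity(disparity_px):
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return FOCAL_LENGTH_PX * BASELINE_M / disparity_px

print(depth_from_disparity(20.0))   # a feature shifted 20 px between views is ~1.75 m away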

And if so, you find that trivial? On this processor?
Running Ubuntu on the HD2 (1GHz 65nm Scorpion), an unoptimized ARM compile of x264 can perform motion search components, transform, and compensation as part of its encoding process at nearly 2fps. If the costly encoding process is factored out, the speed increases to over 10fps for the ME/MV processing components alone. Obviously, a dedicated, optimized processing function would run significantly faster -- particularly if GLES 2.0 shader accelerated -- and optimized specifically for the case of spatial-only binocular processing, it could feasibly be an order of magnitude faster still.

Not with two images only - the uncertainty for frame rate algorithms with that little data is rather large.
This is a misconception. Temporal entropy trumps fixed spatial referencing in terms of the uncertainty introduced. See above.

And at root in this issue, despite whether you de-multiplex space or time, is that the images have significant differences to solve in order to attempt to resolve greater detail.
Which is exactly what the algorithm I suggested does. ;) It's not an either-or proposition. To demultiplex the two such that only "space" is held immutable while time passes is nearly impossible, obviously, because minute environmental changes will almost certainly occur -- depending on your reference frame -- if there is any motion at all in the scene you're capturing, any camera jitter, cloud movement, leaves blowing in the wind, etc. On the other hand, it is possible with 2 synchronous cameras to hold time constant. Instead of a spatiotemporal analysis of the two frames, it's only a spatial one. And because of fixed lens positioning, known focus and calculable tilt, further analysis becomes at once both faster and more accurate.

In the video "increased framerate" case, to clarify, if the satd between a vector field and its proximal frame-parallel counterpart is higher (more unresolvably different / less certain) than that between it and its distal frame-temporal neighbor, distal frame interpolation (if even the simplest pixel blur as found in the 120Hz TV's) will be used on the [mask blended] region indexed between that field and its distal frame-temporal neighbor, and the original corresponding high-satd region of the former simply discarded since it couldn't be accounted for. Another field(s) in that same frame (whose respective ratio is less than 1) can pass through with perspective adjusted merge without distal blending or temporal interpolation. This yields a video containing sequential frames containing what is known from the other. The overall effect is
still quite a bit more accurate a "60fps" than if you took a single 30fps video and mangled it on a simplistic motion blur interpolation.
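As a rough illustration of that ratio-based decision (plain SAD stands in for the SATD mentioned above, and the block/candidate names and the 1.0 threshold are my own placeholders, not the poster's exact implementation):

import numpy as np

def sad(block_a, block_b):
    # Sum of absolute differences between two equally sized pixel blocks.
    return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def pick_candidate(block, frame_parallel_match, frame_temporal_match):
    # If the other camera's (frame-parallel) match costs more than the next
    # frame's (frame-temporal) match, fall back to temporal interpolation for
    # this region; otherwise keep the perspective-adjusted parallel merge.
    ratio = sad(block, frame_parallel_match) / max(sad(block, frame_temporal_match), 1)
    return "temporal_interpolation" if ratio > 1.0 else "parallel_merge"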

And absolutely level is not an option, you've traded time for space, it's a hard requirement.

Yes or no?

Not really -- motion-compensated framerate interpolation is a far cry from the interframe blending performed by TVs, on which you've based most of your assumptions. I can totally understand your confusion if you view the entirety of my algorithm as a mere "motion blur" for the analysis step. Responses in bold within quote.

If you're skeptical that the algorithms could work at all, send me a few (5, to be safe) high-ISO (or somewhat grainy) pictures of the same scene (with some reasonable interval between shots; say, a second) and a video shot with your phone, and I will send you their resultant processed counterparts (degraining and framerate interpolation in this case) via two methods. Since there aren't 2 cameras for time-fixed correlation detection, the accuracy suffers significantly in this temporal-only case, but as a demo I think it's worth proving the point.
 
Ok, many thanks. I've read a number of the frame rate processing monographs (and fully understood them) so I'm quite familiar a priori with the range of techniques proposed for television - they do run the gamut and go well beyond simple interpolation as you describe, if I understood you correctly. (I have some small experience in image processing but am more than well versed in signal processing and its algorithms, so I could follow the material easily.) Probably unless otherwise stated, when I say uncertainty, I'm referring to a mathematical attribute, not a colloquial description.

So, rather than me sending you pictures, you must already have some set of test images that you used to validate your work?

It would therefore seem easier (and I have no reason to distrust you) for you to kindly share with us a left, right, and blended photo sequence.

If you'd like to keep it private to our forums for the moment but not hit image-attachment limits, I think you'll find you can attach a PDF or PDFs of your images.

Would that be acceptable?
 

Whenever I see your posts, I wonder what you do for a living... :-p Did you invent Photoshop or something?
 
Could work, but with the current phone only one person could view the other in 3D at a time, because you would have to point the two cameras on the back at yourself for the other person to see you in 3D. Maybe on the next phone they can put dual front-facing cameras so both could see each other in 3D at the same time; that would be pretty cool IMO. I guess both people could 3D video chat and see each other at the same time via EVO 3D + 3D TV, but then both people would have to wear glasses, which would be lame. The 3rd dimension adds so many possibilities; can't wait to tinker with mine.
 
Probably unless otherwise stated, when I say uncertainty, I'm referring to a mathematical attribute, not a colloquial description.
Yes, as I described, the SAD for a vector that either exceeds a certain fixed threshold and/or constitutes a ratio of the type I described is one of the most often used measures of 'cost' for a particular transform, which is an inherent metric of "certainty" when the difference cannot be compensated in ME.

So, rather than me sending you pictures, you must already have some set of test images that you used to validate your work?

It would therefore seem easier (and I have no reason to distrust you) for you to kindly share with us a left, right, and blended photo sequence.
I have only worked with single cameras, not two simultaneously. The techniques I described do indeed work for that, but my proposal is that the second camera would dramatically increase the usability and accuracy of such operations (and other potentially neat ideas). I guess I'll provide a few samples then. No need for privacy or the like of course.
 
Would a 3D photo taken with this phone be able to be set as the phone wallpaper and be seen in 3D as a wallpaper?
 

Probably not -- for one, the stereo effect only works in landscape mode, but the main ('desktop'?) user interface is largely set up for portrait mode. They've also said that the UI won't be in 3D. (Plus, having those UI elements hovering over a stereo photo would look a little weird.) Not that it's impossible, but it seems unlikely.
 
If I send 3D pics to other EVO 3D users, will they be able to view them in 3D?

Sorry if this has been asked before.
 
I am totally guessing on this, but I would have to assume so. After all, the files are being saved in a certain format for the phone to read, so why couldn't the same type of phone receive that particular file format and then read it on that device?

This is all assuming that you can send the 3d files over Email or pic/video messaging.
 
Sorry if this has been discussed but I haven't seen it covered. I want to know if the camera is actually a downgrade going from 8 mp to 5. It really is affecting my decision to buy or not. With some tweaking I got my original evo to take some really nice pix. I don't want to go backwards
 
Short answer: No, it's not a downgrade.

Long answer: Search out novox77's post on cameras. But it stands that anything over 2 megapixels is overkill for a cell phone camera, anyways. There is a picture segment in the review done by wirefly in the sticky on top of this thread, too.
 

Short answer is no, it's not necessarily a downgrade. Resolution is only a small part of picture quality. IMO the order of importance is this:

1) Lens (by a huge margin)
2) Sensor (includes sensor size and pixel density/resolution)
3) Signal Processing

Resolution is only important to picture quality when you are printing out pictures in larger formats.
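For a rough sense of when resolution actually becomes the limit, here is the print-size arithmetic (300 DPI is an assumed target for a crisp print, and the pixel dimensions are typical 5 MP and 8 MP sensor outputs, not the exact EVO figures):

# Largest print a given sensor resolution can cover at a target print density.
def max_print_inches(width_px, height_px, dpi=300):
    return width_px / dpi, height_px / dpi

print(max_print_inches(2592, 1944))   # ~5 MP -> about 8.6 x 6.5 inches
print(max_print_inches(3264, 2448))   # ~8 MP -> about 10.9 x 8.2 inches

At 4x6 print or on-screen sizes, both comfortably exceed what's needed; the gap only shows up in large prints or heavy cropping.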

Edit: Didn't see jerofld's post before I made mine. I disagree with anything over 2MP being overkill for the newest cell phones.
 
The EVO uses the same sensor as the iPhone. Hence HTC went with a 5mp camera because the iPhone had better pic quality with a 5mp camera when compared to the 8mp camera on the EVO.
 
As the others are saying, it's more about quality of the sensor. The number of MP is a side issue to overall quality.

If I have two super-great sensors made by the same manufacturer, same series, and one is x (some number) MP and the other is 2x MP, then the 2x MP one is probably better.

If I have a doggy sensor at 8 MP and a great sensor at 5 MP, the great sensor will win every time, it'll just have fewer dots.

Where does the greatness come from if not MP? How big (how much light can it get in), how sensitive (can it really go to town on light), how noisy is it? (Hey, it's a digital doodad converting kinda analog (light) to digital - it WILL have noise.)

People like to call this the megapixel myth in digital photography.

Early results indicate the 5 MP in the 3vo may outclass the 8 MP in the Evo, for example.

See also, as suggested earlier, one of novox77's posts for example for more info: http://androidforums.com/htc-evo-3d/305800-what-if-no-3d.html#post2552153

The EVO uses the same sensor as the iPhone. Hence HTC went with a 5mp camera because the iPhone had better pic quality with a 5mp camera when compared to the 8mp camera on the EVO.

No disrespect, but do you have a part number on that? It's not listed on the schematic.

FWIW -

OmniVision OV8812 - camera in the HTC EVO 4G.

OmniVision OV5642 - camera in the Apple iPhone 4.

OmniVision OV2640 - camera in the BlackBerry Storm and Curve.
 