Hello Everyone,
This is in continuation of the last week’s post: LINK.
Following the last post, we have all seen a lot of references to 4 cores vs 2 cores. 2 cores does look underwhelming, right? Let me ask you another question: there is a bike with 2 tyres and a car with 4 tyres, which one will go faster? The answers to both questions are somewhat related.
What do you think Mr. Amdahl?
“If for a given problem size a parallelized implementation of an algorithm can run 12% of the algorithm’s operations arbitrarily quickly (while the remaining 88% of the operations are not parallelizable), Amdahl’s law states that the maximum speedup of the parallelized version is 1/(1 – 0.12) = 1.136 times as fast as the non-parallelized implementation.” – Wikipedia
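To put numbers on the quote, here is a minimal Python sketch of Amdahl's law. The function names are made up for illustration, and the 12% figure is simply the one from the quote, not a measurement of any real workload.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of cores grows without limit."""
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    p = 0.12  # parallel share taken from the quote above (an example, not a measurement)
    for cores in (2, 4):
        print(f"{cores} cores: {amdahl_speedup(p, cores):.3f}x")
    print(f"limit: {amdahl_limit(p):.3f}x")
```

With only 12% of the work parallel, 2 cores give about 1.064x and 4 cores about 1.099x, so doubling the core count buys very little.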
Let’s check this image:
You can see the 4 cores here, but you can also see HD Video Decoder and Encoder blocks. Now let’s see this image:
Observe a few things here. We are comparing 2 (dual) vs 3 (quad) here (for power). We are focusing on video power saving, which should be handled by the HD Video blocks. There are also separate Audio, Image, HDMI, Display and really awesome GPU blocks.
So where do the 4 cores actually help? Or, to be contextually correct, where have we added parallelism to use these 4 cores? Moreover, there are SIMD units (NEON) which provide acceleration when code is optimized for them. There is this brilliant article on “death of cpu scaling” which you must read. I think it will be safe to say in current context that adding more cores to GPU will make much more sense unless we see an OS or an API which lets developers use these cores.
If all the video, audio, imaging, graphics, etc. processing requirements are taken away from the CPU, it must now be mostly responsible for the operating system's demands (you are aware that ICS uses hardware acceleration for its user interface, which is again another block outside the CPU). Android must have the answer for this.
RenderScript is finally a great way of using the parallelism available with the increasing number of cores (CUDA is still not available on embedded devices, and Dalvik is Closed source, so I can’t comment on it). RS does two tasks, compute and graphics. Graphics runs on the GPU and compute runs on the CPU cores. So if you want to compute, you can use these cores. Current applications would be linear algebra, Fourier transforms, n-body problems, graph traversal, hidden Markov models (e.g. speech recognition), and finite-state machines. We’d love to see SoC manufacturers or the Open Source community come forward with APIs which developers, students, professionals and hobbyists can comprehend easily and use to actually improve the performance of their applications.
We have no idea of the amount of parallelism available on Android, but yes, in marketing and on paper, 4 does look AWESOME! 🙂
We were also supposed to cover some of the blocks on OMAP. Here is the image again for reference:
Let’s talk a bit on IVA today.
IVA stands for Image and Video Accelerator and, as expected, it does a lot:
- 1080P video
- Slow Motion Camcorder
- Real-time Transcoding up to 720p
- Video Conferencing up to 720p
Other interesting components are:
- Video DMA Processor
- Shared L2 interface and Memory
- Motion Estimation Acceleration Engine
- Entropy coder/decoder
- and much more.. Check out the TI OMAP TRM for more information.
Why should we (hw designers, programmers and interested users) know about IVA? Because this is the part which decides which video formats we can play. As normal users, the most we know about a video is that it is mp4, avi, dvi etc. We let manufacturers (including us) claim that we can do 1080P, and when we get the device we realize that there are multiple variants of these formats, technically called profiles: Constrained Baseline, Main, High, Simple, Advanced Simple Profile and many more. Not all formats are free; some carry royalties (like MPEG), which in the end add to the overall cost of the device. Also, not all formats are supported, but SoC vendors share details on which ones are enabled and which ones OEMs should work on. This time we know which formats are already working and which we need to work on. We will share all profile support and not just 1080P, since this is very important for end usability, because frankly the user doesn’t give a damn, his video should “just work” 🙂
I think it is becoming a long post, so let’s stop here. Next time we will cover ISP and more.
Warm Regards
Rohan Shravan
Good luck to you Rohan. I hope adam 2 will be more successful than adam1
So the point is shouldn’t we look at Linux/Ubuntu port for Adam1/2 which can throw some light on Alison to advantage of the cores?
can we get a peek on the performance graphs comparison between the processor in the ipad and the OMAP 44XX
Alison = APIs, auto correction on Adam, lol
Ok.. It took some time.. but now i am getting it… 🙂
All the best for adam 2.
keep it up.
Potentially dumb question…
“Slow Motion Camcorder”
Does this mean higher frame rates to achieve slow motion as well as normal speed?
or
Does this mean a low performance camcorder for simple recording?
Take the video at a very fast rate, say 90fps, so that when you play it back slowed down to 20fps, it still looks smooth. I think 120 can also be achieved, but with a loss of resolution.
Regards
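For anyone who wants to check the slow-motion arithmetic, here is a small Python sketch. The 90fps and 20fps figures are the examples from the reply above, not a published spec, and the function names are invented for illustration.

```python
def slowdown_factor(capture_fps: float, playback_fps: float) -> float:
    """How many times slower the action appears on playback."""
    return capture_fps / playback_fps


def playback_seconds(recorded_seconds: float, capture_fps: float, playback_fps: float) -> float:
    """Length of the clip once every captured frame is shown at the playback rate."""
    frames = recorded_seconds * capture_fps
    return frames / playback_fps


if __name__ == "__main__":
    # Example rates taken from the reply above (assumed, not measured).
    print(slowdown_factor(90, 20))       # 4.5
    print(playback_seconds(10, 90, 20))  # 45.0
```

So 10 seconds shot at 90fps stretch to 45 seconds at 20fps, with no frames repeated or invented.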
Wow!
Sweet!
That is what I HOPED it meant!!!
Thanks, Rohan!
Keep the info coming, I am learning so much.
😀 😀 😀
Rohan please correct the mistake.. you mention “2 (dual) vs 3 (quad)”. It must be 4 (quad).
It's right actually. We were comparing Tegra 2 and Tegra 3; the former is dual core and the latter is quad core.
Regards
Rohan, in this post are you trying to say that we do not have software that can take full advantage of the extra cores? You asked when the CPU will come into play when you have hardware units to do independent tasks. I think the CPU will play a major role when you have multiple threads running in the OS. I am not sure if ICS is multithreaded but Windows definitely is. iOS is not multithreaded and hence they have that brilliant performance. I thought Android does the core management pretty well. Yes, the hardware video decoder can’t decode all formats. I am sure you will have to buy the codecs from companies that sell codecs as IP.
🙂
what you're saying makes sense from a tech standpoint, most of the people following you right now will understand what you are saying about 2 vs 4 cores. but if you're trying to sell to the average consumer, you will have a harder time with a 2 core tablet. the standard person is going to look at the spec sheet, see two options, two and four cores, and most likely they will pick the four core thinking it's twice as powerful even if yours is ten times more powerful.
it's all about the demographic you're targeting, whether your aim is to sell to the limited techies or the infinite masses. i’m a techie myself, so would love to see your new tablet in action and there’s a high likelihood i will buy your adam 2. but i’m concerned from a business and human nature standpoint. i just want NI to grow prosperously,
i’m just giving you more data, before setting out on an endeavor i thought you should be aware of both sides of the coin.
PS
Sorry for sounding negative. I look forward to having my own adam 2. I will support notion ink as much as I can, whether it be buying your products or giving you my 2 cents. I said it before, I just want NI to succeed.
Hi,
Thanks for your support! I understand your point! It is too early to say what we are doing right now, so will wait and seek your opinion again when we are ready. 🙂
Regards and have a great day!
I agree with both Roopesh Nair and Ahmed Elkady.
You can’t teach all customers about all this stuff. Most people (including myself) look at the specifications first. How can you expect them to know all this?
The 2nd reason why you should opt for quad cores is the greater possibilities and better performance when porting other operating systems. TabletRoms has a major role in the success of NI, and you can see there that many of your customers are working hard to port Linux/Ubuntu. Some have even talked about porting Windows in the past, which might again be a hot topic when “Win 8 tablet edition” is released.
I don’t think quad core will be a waste in that case, or will it? (I don’t know for sure, because until today I was not even aware of the relationship between cores and Android.)
I’m really excited to see VP8 on the OMAP:
http://blog.webmproject.org/2010/10/demo-of-webm-running-on-ti-omap-4.html
http://blog.webmproject.org/2012/01/vp8-codec-sdk-duclair-released.html
The WebM team have landed a lot of improvements to their realtime encoder, and with the input video element coming up in Firefox, good times are ahead.
Here’s to you guys, Rohan.
Rohan,
I think it is important to mention Gustafson’s law if you bring Amdahl’s in the discussion. While Amdahl’s law is fairly simple and true in its own context, motivation for parallelism lies in the fact that as the size of the problem grows, the part of the code that cannot be parallelized continues to remain constant (or grow slowly) while the part that can be parallelized increases significantly, and hence adding more processors does improve speedup.
The speedup plot you showed shows an example of 1hr (sequential) vs 19hrs (parallel) on 19 processors/cores and 20x maximum speedup as a result. Consider a case of a larger problem solved with 4hrs (sequential) and 495hrs (parallel) on 495 cores (ideally solving in 1 hr), then the total execution time is 5 hrs (parallel) against 499 hrs (sequential) and a speedup of about 100x. Increasing the cores does increase speedup, if you have the right problem. Or, did I (or Gustafson for that matter) miss something?
Well, how to make a fair portion of your code parallelizable and keep a smaller portion of it to be sequential? I think many of us are working on that, too. I am sure it won’t be too long before multi-core handheld devices and embedded systems will have parallel computing capabilities. I am glad Adam 2 is going in that direction. 🙂
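The two cases in the comment above can be checked with a few lines of Python. This is only a sketch that assumes the parallel part divides perfectly across the cores with no overhead; the function names are made up for illustration.

```python
def scaled_speedup(serial_hours: float, parallel_hours: float, cores: int) -> float:
    """Speedup when the parallel part is split evenly over `cores` (ideal, no overhead)."""
    one_core_time = serial_hours + parallel_hours
    many_core_time = serial_hours + parallel_hours / cores
    return one_core_time / many_core_time


def max_speedup(serial_hours: float, parallel_hours: float) -> float:
    """Ceiling on speedup for this problem size, however many cores are added."""
    return (serial_hours + parallel_hours) / serial_hours


if __name__ == "__main__":
    # Small problem: 1 h serial + 19 h parallel on 19 cores.
    print(scaled_speedup(1, 19, 19), max_speedup(1, 19))   # 10.0 20.0
    # Larger problem: 4 h serial + 495 h parallel on 495 cores (499 h on one core).
    print(round(scaled_speedup(4, 495, 495), 1))           # 99.8
```

The small problem tops out at 20x no matter how many cores are added, while the larger one reaches roughly 100x because its serial share is much smaller. That is Gustafson's point.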
that asks what would be the typical workflow on Adam (on tablets in general)
Yes, if i transcode FLACs to OGGs, i can utilize as many cores as i wish, if RAM is fast enough. I would just transcode 10 files in parallel. But would it be typical for tablet ?
Single bottleneck can turn all Gustafson to Amdahl
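A rough Python sketch of the transcoding case above, assuming every file takes the same time and that RAM and storage never become the bottleneck (both are assumptions, and the function name is invented for illustration):

```python
import math


def batch_speedup(files: int, cores: int) -> float:
    """Speedup from running one file per core, in rounds, versus one file at a time."""
    rounds = math.ceil(files / cores)  # each round runs up to `cores` files at once
    return files / rounds


if __name__ == "__main__":
    for cores in (2, 4, 8, 16):
        print(f"{cores} cores: {batch_speedup(10, cores):.2f}x")
```

For 10 files this gives 2.00x on 2 cores, 3.33x on 4, 5.00x on 8 and 10.00x on 16, so even a perfectly parallel job does not always scale by the core count, and a shared bottleneck would lower these numbers further.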
> in current context that adding more cores to GPU will make much more sense
i guess u meant “would NOT make much sense” 🙂
i wish there could be codecs infrastructure on Android, like DirectX Media or GStreamer or OpenMAX IL
so that you make hardware support and user then can select any front-end player for him
i figure you’re well into development and it's too late to make changes, but one thing i would like to see in a tablet that i don’t think ANY have is the ability to use it as a monitor, via an hdmi in. That would rock, its functionality would be greatly increased and that might be the next niche feature that could sell adam 2’s… similar to the pixel qi on the original. The tablet could then be used in conjunction with a Blu-ray or dvd player, it could function as a monitor for a game system, you could watch tv on it with an external tuner, it could be a reference screen when giving presentations if the tablet itself couldn’t be used for the presentation, and more possibilities i’m sure…