Written by a foreign blogger; I have added some notes of my own at the end:
Compiling CMEM for the Beagleboard…
Filed under: Beagleboard, DSP, Linux, OMAP3530. Tags: Beagleboard. Posted by Nils @ 10:51 am
Since I tend to forget these things, here’s a little tutorial on how to compile the Texas Instruments CMEM and SDMA kernel modules for the Beagleboard. I don’t like the codec-engine build process, so I’ll compile the kernel modules by hand.
So what’s CMEM all about?
In a nutshell, CMEM is a kernel module that allows you to allocate contiguous memory on the OMAP3 and map it into the address space of a user-mode program so you can read and write to it.
CMEM also gives you the physical address of these memory-blocks.
This is important if you want to share some memory with the C64x+ DSP as the DSP has no idea what the memory manager of the Cortex-A8 is doing. It also allows linux user-mode programs to allocate memory that can be used with DMA.
Things you need:
- The sources of the linuxutils package from the TI website (registration is required but free). I’ve used release 2.24, which works fine with my 2.6.29-omap1 kernel image.
- The Linux kernel sources for the Beagleboard. If you use OpenEmbedded and you have already compiled an image, you’ll most likely find them at $OE_HOME/tmp/staging/beagleboard-angstrom-linux-gnueabi/kernel/.
- A cross-compiler toolchain for ARM. I still use the CodeSourcery 2007q3 lite release. Works for me.
- A Beagleboard. Although not strictly required, it makes perfect sense to have one.
How to compile CMEM:
- Untar the linuxutils package. The place where to untar them is not important.
- Go into the CMEM subfolder. For the 2.24 release it’s the ./packages/ti/sdo/linuxutils/cmem/ folder.
- Take a look at the Rules.make file. Messy, ain’t it? Remove the write protection; chmod +w Rules.make will do that. You now have to adjust the paths in that file or, if you’re like me, delete it and write it from scratch. Here is my copy with everything not needed removed:
# path to your toolchain. Yes, you need to set it twice (don't ask...)
MVTOOL_PREFIX=/opt/CodeSourcery/bin/arm-none-linux-gnueabi-
UCTOOL_PREFIX=/opt/CodeSourcery/bin/arm-none-linux-gnueabi-
# path to the kernel-sources:
LINUXKERNEL_INSTALL_DIR=${OE_HOME}/tmp/staging/beagleboard-angstrom-linux-gnueabi/kernel
# some config things:
USE_UDEV=1
MAX_POOLS=128
- That’s it. If all paths are correct, “make release” should build the kernel module and some test applications.
How to test CMEM:
- Copy the kernel-module to the beagleboard. For the test I’ve just copied it into /home/root/. You’ll find the kernel-module at ./src/module/cmemk.ko
- On the board, check your U-Boot boot-parameters. Since CMEM manages physical memory you have to restrict the amount of memory managed by linux. To put aside some memory add the mem=80M directive to the bootargs. You can of course use a different setting if you want to, but the following examples assume 80M for the linux-kernel and the rest for DSP and CMEM.
- Boot the beagle and login as root.
- Load the kernel module. Let’s keep things simple and create a single 16 MB memory pool. To do so, load the module like this:
/sbin/insmod cmemk.ko pools=1x1000000 phys_start=0x85000000 phys_end=0x86000000
If everything worked as expected you’ll find the following line in the kernel-log (type dmesg to get it):
cmem initialized 1 pools between 0x85000000 and 0x86000000
If not, well, CMEM will give you a bunch of hints in the kernel log if it had problems during initialization. Most likely you’ve got the addresses wrong. As the start address you should pass 0x80000000 plus the size you’ve specified in the U-Boot bootargs. Add the sizes of all of your CMEM pools to the start address and use the result as the end address.
- While the module is loaded you’ll find a file under /proc/cmem with some statistics.
- If everything worked so far, you can run some of the demo applications like apitest. They are located in the ./apps/apitest/ folder.
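The address rule from the test steps above can be written down as a small calculation. This is only a sketch: the helper names (pool_spec, cmem_phys_start, cmem_phys_end) are made up for illustration and are not part of CMEM, and rounding each buffer up to a 4096-byte page is an assumption about how the module lays out its pools. It reproduces the 80M example (0x85000000 to 0x86000000) and also the range quoted in the comments further down (pools=2x4001792,10x4096 starting at 0x9ba00000). Those comment values only add up when the pool sizes are read as decimal byte counts, so a 16 MB pool would presumably be written 1x16777216; check that against the documentation of your CMEM release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DDR_BASE   0x80000000u  /* start of SDRAM on the OMAP3, as used in the text */
#define PAGE_BYTES 4096u        /* assumed granularity for rounding pool buffers */

/* One COUNTxSIZE entry of the pools= parameter (hypothetical helper type). */
struct pool_spec {
    uint32_t count;  /* number of buffers in the pool */
    uint32_t size;   /* bytes per buffer */
};

/* First physical address above the memory given to Linux with mem=<mib>M. */
static uint32_t cmem_phys_start(uint32_t linux_mem_mib)
{
    return DDR_BASE + linux_mem_mib * 1024u * 1024u;
}

/* End address: start plus the page-rounded size of every buffer in every pool. */
static uint32_t cmem_phys_end(uint32_t phys_start,
                              const struct pool_spec *pools, size_t npools)
{
    uint64_t total = 0;

    assert(pools != NULL || npools == 0);
    for (size_t i = 0; i < npools; i++) {
        uint64_t rounded =
            ((uint64_t)pools[i].size + PAGE_BYTES - 1) / PAGE_BYTES * PAGE_BYTES;
        total += rounded * pools[i].count;
    }
    return (uint32_t)(phys_start + total);
}
```

With mem=80M, cmem_phys_start(80) gives the phys_start value used in the insmod line, and cmem_phys_end() with one 16 MiB buffer gives the matching phys_end.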
Compile an ARM program that uses CMEM:
This is easy. Copy ./src/interface/cmem.h to a place where the cross-compiler will find it and add one of the cmem.a libraries to your project. Since I like to keep things simple I’ve just added the interface source to my project. It’s ./src/interface/cmem.c.
Now you can allocate contiguous memory and get the physical address of it. Big deal, eh? Honestly, like I said CMEM only makes sense if you want to make use of the C64x+ DSP or the SDMA of the OMAP3.
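To make the allocate-and-get-the-physical-address step concrete, here is a minimal sketch of a thin wrapper around the CMEM user-space interface. The wrapper names (shared_buf, shared_init, shared_buf_alloc and so on) and the USE_CMEM switch are inventions for this example. The CMEM calls used (CMEM_init, CMEM_alloc, CMEM_getPhys, CMEM_free, CMEM_exit, CMEM_DEFAULTPARAMS) are the ones declared in cmem.h as far as I know, but check the header of your release, since return types have changed between versions. Without USE_CMEM the code falls back to malloc so it can be compiled on a desktop machine; that fallback is neither contiguous nor does it have a usable physical address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef USE_CMEM
#include <cmem.h>   /* from ./src/interface/ of the linuxutils package */
#endif

/* A buffer plus the physical address to hand to the DSP or a DMA engine. */
struct shared_buf {
    void *virt;           /* address usable by the ARM program */
    unsigned long phys;   /* physical address, 0 if unknown */
    size_t size;          /* requested size in bytes */
};

/* Must be called once before any allocation; returns 0 on success. */
static int shared_init(void)
{
#ifdef USE_CMEM
    return CMEM_init();
#else
    return 0;
#endif
}

/* Allocate a buffer; returns 0 on success and -1 on failure. */
static int shared_buf_alloc(struct shared_buf *b, size_t size)
{
    assert(b != NULL);
    b->size = size;
#ifdef USE_CMEM
    CMEM_AllocParams params = CMEM_DEFAULTPARAMS;  /* library defaults */
    b->virt = CMEM_alloc(size, &params);
    b->phys = b->virt ? CMEM_getPhys(b->virt) : 0;
#else
    b->virt = malloc(size);  /* host fallback: no physical address available */
    b->phys = 0;
#endif
    return b->virt ? 0 : -1;
}

/* Release the buffer with the same parameters it was allocated with. */
static void shared_buf_free(struct shared_buf *b)
{
    assert(b != NULL);
#ifdef USE_CMEM
    CMEM_AllocParams params = CMEM_DEFAULTPARAMS;
    if (b->virt)
        CMEM_free(b->virt, &params);
#else
    free(b->virt);
#endif
    b->virt = NULL;
    b->phys = 0;
    b->size = 0;
}

/* Counterpart of shared_init(). */
static void shared_exit(void)
{
#ifdef USE_CMEM
    CMEM_exit();
#endif
}
```

On the board you would build this with -DUSE_CMEM and either link one of the cmem.a libraries or add cmem.c to the project, as described above. The phys field is the value you would pass on to the DSP.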
Comments (6)
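In fact, CMEM already ships with documentation on how to build it, but Rules.make is a bit odd: you also have to set UCTOOL_PREFIX! As the post above already says, it is simply the gcc prefix, something like arm_v5t_le- or arm-none-linux-gnueabi-.
The version I rebuilt is 1.3. Its src/interface/Makefile uses UCTOOL_PREFIX and joins it to gcc with a slash (/), so the build complains that arm_v5t_le-/gcc cannot be found. Editing the Makefile fixes that.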
-pools created with CMEM: pools=2x4001792,10x4096 phys_start=0x9ba00000 phys_end=0x9c1ac000 (the 2 big ones are for images)
On the GPP side:
-initialization of CMEM with CMEM_init() and dsp_mmu_map_cmem() (the function you provide for the MMU problems)
-images allocated via CMEM_alloc (which automatically chooses the big pools)
-physical addresses of the images are obtained with CMEM_getPhys and sent to the DSP with MSGQ_put
-wait for the answer from the DSP
On the DSP side:
-receive addresses with MSGQ_get
-do the grayscale conversion from the color image buffer to the grayscale buffer
-send a done message to the GPP with MSGQ_put
The problem is that the conversion takes a very long time. So I removed the conversion and replaced it with a basic copy (all red data from the color image is copied to the grayscale image), and the problem is still the same. The same copy on the ARM is about 40 times faster… Do you have any idea what I am doing wrong?
Thanks for your help. Comment by Guillaume — February 4, 2010 @ 11:31 pm
Nils Comment by Nils — February 5, 2010 @ 12:56 am
First, I have to clarify my last comment. I realized that the copy on the ARM was faster because it was not working on a buffer allocated with CMEM. If I made a copy on a buffer allocated with CMEM, DSP is a little faster.
So I think the problem comes from CMEM, and I started to work only on the GPP side. The results are as follows (byte access):
-100 MByte write on a CMEM buffer takes 15 seconds and 204194 microseconds (6.577 MByte/s)
-100 MByte write on a classical buffer takes 1 second and 910461 microseconds (52.343 MByte/s)
Using a CMEM buffer is very slow… Now I have to discover how to improve this… Regards,
Guillaume Comment by Guillaume — February 5, 2010 @ 12:39 pm
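The throughput figures in the comment above follow directly from the timings, and a comparison of this kind is easy to repeat. The sketch below is not Guillaume's code: both function names are hypothetical, and it uses the standard clock() function, which reports CPU time and not wall-clock time. The first function redoes the arithmetic; the second times byte-wise writes to whatever buffer you pass in, for example one from malloc and one from CMEM_alloc.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* MByte/s from an amount in MByte and a time given as seconds + microseconds. */
static double throughput_mbyte_per_s(double mbytes, long seconds, long microseconds)
{
    double elapsed = (double)seconds + (double)microseconds / 1e6;

    assert(elapsed > 0.0);
    return mbytes / elapsed;
}

/* Write len bytes, one byte at a time, passes times; returns CPU seconds used.
 * The volatile qualifier keeps the compiler from optimizing the stores away. */
static double time_byte_writes(volatile unsigned char *buf, size_t len, unsigned passes)
{
    clock_t t0 = clock();

    for (unsigned p = 0; p < passes; p++)
        for (size_t i = 0; i < len; i++)
            buf[i] = (unsigned char)(i + p);

    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Feeding the two measurements into throughput_mbyte_per_s() gives roughly 6.58 and 52.34 MByte/s, so the classical buffer is about 8 times faster for byte writes in this test.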
Nils Comment by Nils — February 5, 2010 @ 6:27 pm
Guillaume Comment by Guillaume — March 1, 2010 @ 11:17 am