New Blog Title
Saturday October 06th 2007, 1:00 pm
Filed under:
General
You’ll notice I changed the blog title to OpenOffice.org: Calc and R… Summer of Code ended more than a month ago, and luckily this package is still being developed. I haven’t done much since early September, mainly because of a big move from Toronto, Canada to New York, USA for an internship. I’ll be heading to France for a week next week, but aside from that I hope to get some more good coding done and release a version 1.0.
The good news, however, is that the package has already been tested on NeoOffice (Mac), as well as on the Linux and Windows versions of OpenOffice, and I’m glad to say they all seem to work!
The Future
Monday August 27th 2007, 1:36 pm
Filed under:
General,
To Do
For those following the Summer of Code blogs, websites, or discussion boards, it’s pretty clear that the official coding period is now over. In light of this, I released 0.1.6 and since then, 0.1.7 to fix a few bugs. I must say the program has been a great experience, and I’m glad to have learned so much about OpenOffice, R, and software development in general. I am also grateful for the many e-mails I’ve received from people who have tried the add-on and are happy with how it works.
So, what does the future hold? I plan to continue developing the add-on… The next steps deal with fixing bugs rather than adding features, and with formally releasing a version 1.0. Then I’ll work on documenting and promoting the tool, and finally begin adding more features. If more and more people keep using this tool, I’ll gladly keep updating it, and will do my best to keep the process as transparent as possible.
For anyone interested in getting involved, please contact me.
0.1.6
Version 0.1.6, but really, a version very close to 1.0. I was looking at the aims I set for this package at the beginning of the summer, and it seems it does everything I wanted it to do. Sure, it’s not perfect and needs some testing and improvements, but in terms of a Calc add-on that provides R functionality, everything is there.
So this release will be the last or next-to-last (if I release one more in the morning!) release during the official Summer of Code period… I’ll be writing a reflective post soon, and maybe even a bit about the future.
Until then, download the package, source code, and manual.
Image Embedding Now Supported
Finally, version 0.1.5 of the add-on is out… You can download the UNO package or the source code. The manual has been updated as well, and discusses how to send R output into the spreadsheet.
Embedding Images in Spreadsheets
After a few good hours of trying to figure it out, I finally got it working… It’s a conglomeration of advice, mainly based on two sites… The first deals with linking to images from a spreadsheet, while the second focuses on embedding images in text documents. After a while, I got everything to work as I hoped, and for those interested, I wrote a wiki section on this: Calc Api Programming - Graphics.
This brings the R/Calc integration project to a nearly complete initial version, with graphics output from R going directly into Calc. The really neat thing about embedding images is that you can use the R/Calc add-on to make some plots and output data, then send the file to another person, who can see the information without needing Rserve or R at all!
Really Pretty Pictures
After reworking some of the code, it turns out making graphics output work wasn’t as difficult as I feared. The graphics tool is still buggy and doesn’t refresh properly, but I’ll be fixing this within the next few days (I hope!). What I did want to do now, however, is share some code and pretty pictures.
First, I’m a big fan of social network analysis and installed the sna package for R. I opened Calc and the R coding window, and typed the following:
library(sna)
rgraph(10)
{$OUT#A1:J10}<-{$BASE}
gplot({$A1:J10})
The result is below:
Another interesting package is scatterplot3d, which does pretty much what the name suggests. Here’s some more code:
rnorm(25)
{$OUT#A1:A25}<-{$BASE}
rnorm(25)
{$OUT#B1:B25}<-{$BASE}
rnorm(25)
{$OUT#C1:C25}<-{$BASE}
library(scatterplot3d)
scatterplot3d({$A1:A25}, {$B1:B25}, {$C1:C25}, pch="x",
color="red", xlab="A", ylab="B", zlab="C")
And another pretty result:
Matrix Output
A new version is out (0.1.4), this one including matrix output. I am now updating the manual with every release as well. As always, the package, source code, and manual are all available at http://wiki.services.openoffice.org/wiki/R_and_Calc.
Please let me know what you think!
Function Calls, For Loops
Thursday August 09th 2007, 3:01 pm
Filed under:
To Do
Right now you can hack your way into using function calls within Rserve by putting everything on one line. Sending a multi-line script through Rserve in the hope of getting R to run a for loop or create a custom function doesn’t work, but if you put all the code on one line, it does. For example, sending each line below as a separate argument works quite well:
randNumX <- function(x) {z <- rnorm(1); return(x*z)}
randNumX(3)
Due to syntax issues, this doesn’t work in the add-on yet because one can’t properly use the { and } braces. I’m working on it, though, and it should be available soon.
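To illustrate the one-line trick outside the add-on, here is a minimal sketch in plain R of how a multi-line loop or conditional can be collapsed into a single line using semicolons. The variable names are made up for the example, and I am assuming the line is handed to R as-is; it does not address the brace handling in the add-on itself.

```r
# Multi-line form, as one would type it in a normal R console:
#   total <- 0
#   for (i in 1:5) {
#     total <- total + i
#   }

# Collapsed onto one line, with semicolons separating the statements:
total <- 0; for (i in 1:5) { total <- total + i }; total

# The same idea for an if/else, which is an expression in R:
x <- 3; label <- if (x > 2) { "big" } else { "small" }; label
```

Each of the two code lines is a complete unit on its own, so it could be sent as a single argument in the same way as the function definition above.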
A Manual
Aside from debugging and testing, the last few days have been spent writing a manual for the add-on. While a wiki exists, I’m beginning to prefer having a basic ODT or PDF file that can be downloaded and used at any time. It’s a bit cleaner than a wiki, though it isn’t necessarily updated as regularly.
So with this in mind, I put together a basic manual. Its main attraction right now is the documentation it contains about the Scripting Window, which was added in version 0.1.3 and which is currently not documented on the wiki itself.
The manual is available in PDF… No ODT due to WordPress security restrictions! I welcome any and all suggestions.
New Version and Architecture
While I was away, I really got excited about the idea of implementing a basic console interface for R through Calc. The idea is that you start R (and Rserve) from Calc and get a window where you write R scripts as you would in the console. The only difference is that in addition to writing these scripts, you can send outputs directly to the spreadsheet you’re working with. For example, if you have two columns of data (say, with 10 rows each) and want to find some information, you could type:
cor.test({$A1:A10}, {$B1:B10})
{$OUT#C1}<-p.value
{$OUT#C2}<-conf.int[0]
{$OUT#C3}<-conf.int[1]
This will run a correlation test on those two columns and output the p-value and the 95% confidence interval to the third column. Yes, the code isn’t as clean as it could be, but it works, and I’m thrilled. The only things I still need to fix are conditional statements and for loops, which don’t work properly due to their multi-line nature.
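For comparison, here is a rough sketch of the same analysis in plain R, run outside the add-on. The data is randomly generated for illustration (in Calc it would come from A1:A10 and B1:B10), and the variable names are my own. Note that the script above uses the add-on's own syntax, whereas in plain R vector indices start at 1.

```r
# Made-up data standing in for the two spreadsheet columns.
set.seed(42)
col_a <- rnorm(10)
col_b <- col_a + rnorm(10, sd = 0.5)

# cor.test() runs a Pearson correlation test by default and
# returns a list-like object with named components.
result <- cor.test(col_a, col_b)

p_value  <- result$p.value      # the value sent to C1 in the script above
ci_lower <- result$conf.int[1]  # lower bound of the 95% interval (C2)
ci_upper <- result$conf.int[2]  # upper bound of the 95% interval (C3)

print(c(p = p_value, lower = ci_lower, upper = ci_upper))
```

The 95% level is the default for cor.test() and can be changed with its conf.level argument.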
On the unseen side, I modified the architecture of the add-on so that all the tools use one basic class that communicates with R and scans its object structure to get proper outputs. This will make life easy if I move the package to JRI rather than continue using Rserve.
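As a loose illustration of what scanning an object's structure can mean, the sketch below walks a nested R result and lists the path to every value it contains. It is written in R purely for illustration; `leaf_paths` is a hypothetical helper of my own and not the class used inside the add-on.

```r
# Hypothetical helper: return the path to every leaf value in a nested list.
leaf_paths <- function(obj, prefix = "") {
  if (is.list(obj) && length(obj) > 0) {
    nms <- names(obj)
    if (is.null(nms)) nms <- rep("", length(obj))
    out <- character(0)
    for (i in seq_along(obj)) {
      # Use the element's name when it has one, otherwise its position.
      label <- if (nzchar(nms[i])) nms[i] else paste0("[[", i, "]]")
      path <- if (nzchar(prefix)) paste0(prefix, "$", label) else label
      out <- c(out, leaf_paths(obj[[i]], path))
    }
    out
  } else {
    prefix
  }
}

# Example: the components available on a correlation test result.
paths <- leaf_paths(cor.test(1:10, (1:10)^2))
print(paths)
```

A listing like this shows which named pieces of a result (such as p.value or conf.int) could be picked out and sent to a cell.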
So what’s next? Testing, user interface improvements, and an installer! Yes, the hard-but-fun part is over as this thing works to an extent where I know I’ll be using it, and I hope others will too.