Using the Python Upload and Download Tools with a Java application
There isn't a Java data export tool yet. However, with a little tweaking, you can use the Python App Engine tools with a Java application. The steps are:
1. Install the Python SDK
2. Create a minimal version of your application in Python that defines models mapped to your Java classes
3. Create an exporter configuration in Python
4. Use the bulk download tool
Understanding application versions
Application versions are simply strings. It's possible to run several versions of the same application at once. Suppose you have two versions of one application, a Java version and a Python version.
Let's suppose your Java application's appengine-web.xml sets the application ID to appid and the version to java.
You would be able to access this application at http://java.latest.appid.appspot.com.
Now let's suppose you have a Python application with the following app.yaml file:
application: appid
version: python
runtime: python
api_version: 1
handlers:
- url: /
  script: main.py
You would be able to access this application at http://python.latest.appid.appspot.com.
All application versions have access to the same datastore and memcache. What's really great about this setup is that you can deploy multiple versions, set only one of them as the default to be served, and still reach the others at their version-specific URLs. The default version is set in the administrative console under Administration -> Versions. This technique can also be used to stage new versions of the application against live data (after ample testing, of course) before a full rollout. If a new version introduces bugs into production, rolling back is just as simple: switch the default back to the previous version. There's more about this topic here: http://googleappengine.blogspot.com/2009/06/10-things-you-probably-didnt-know-about.html
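The version-specific URLs above follow a simple pattern, which can be sketched as a small helper. This is only an illustration: version_url is a hypothetical function name, not part of the SDK, and the pattern is taken from the two example URLs in this section.

```python
def version_url(app_id, version, path="/"):
    """Return the URL that addresses one specific deployed version of an app.

    Hypothetical helper; it follows the pattern
    http://<version>.latest.<appid>.appspot.com used in the text.
    """
    if not path.startswith("/"):
        path = "/" + path
    return "http://%s.latest.%s.appspot.com%s" % (version, app_id, path)


print(version_url("appid", "java"))
print(version_url("appid", "python", "remote_api"))
```

The second call builds the address of the remote_api handler on the Python version, which is the endpoint the export relies on later.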
Creating a Python exporter
The datastore is language agnostic. There's nothing that prevents data created by Java from being accessed by Python. Suppose we have a Java class defined as follows:
import com.google.appengine.api.datastore.Key;
import javax.jdo.annotations.*;

@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class Thing {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;

    @Persistent
    private String name;

    @Persistent
    private int number;

    public Thing(String name, int number) {
        this.name = name;
        this.number = number;
    }

    public Key getKey() {
        return key;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getNumber() {
        return number;
    }

    public void setNumber(int number) {
        this.number = number;
    }
}
A Python application can access this data as long as its model uses the same kind name (Thing) and the same property names as the Java class. This particular class's doppelgänger is as follows:
class Thing(db.Model):
    name = db.StringProperty()
    number = db.IntegerProperty()
(If you've been doing nothing but Java and this is your first exposure to Python, be warned! It's addictive and fun.)
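For classes with more fields, the mapping from Java field types to Python property classes can be written down as a lookup table. The sketch below is a standalone illustration that does not need the SDK: JAVA_TO_DB_PROPERTY and model_source are hypothetical names, the table covers only a few common types, and the generated text assumes db has been imported from google.appengine.ext.

```python
# Hypothetical lookup table: Java field type -> name of the matching
# google.appengine.ext.db property class. Not part of the SDK, and
# deliberately incomplete.
JAVA_TO_DB_PROPERTY = {
    "String": "StringProperty",
    "int": "IntegerProperty",
    "long": "IntegerProperty",
    "boolean": "BooleanProperty",
    "double": "FloatProperty",
    "java.util.Date": "DateTimeProperty",
}


def model_source(kind, fields):
    """Return Python source text for a db.Model mirroring a Java class.

    fields is a list of (field_name, java_type) pairs. Leave out the
    primary-key field: the entity key is not a property.
    """
    lines = ["class %s(db.Model):" % kind]
    for name, java_type in fields:
        lines.append("    %s = db.%s()" % (name, JAVA_TO_DB_PROPERTY[java_type]))
    return "\n".join(lines)


print(model_source("Thing", [("name", "String"), ("number", "int")]))
```

For the Thing class this prints the same three-line model shown above. A type missing from the table raises KeyError, which is a prompt to look up the right property class by hand.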
The next step is to create an exporter. Per the Python documentation at http://code.google.com/appengine/docs/python/tools/uploadingdata.html:
from google.appengine.ext import db
from google.appengine.tools import bulkloader

class Thing(db.Model):
    name = db.StringProperty()
    number = db.IntegerProperty()

class ThingExporter(bulkloader.Exporter):
    def __init__(self):
        # Each tuple is (property name, function that converts the value
        # to a string, default value).
        bulkloader.Exporter.__init__(self, 'Thing',
                                     [('name', str, None),
                                      ('number', str, None)
                                     ])

exporters = [ThingExporter]
The exporter is a code-as-configuration file that specifies how the output CSV file is created. The download command later in this article expects it at exporter/thing_exporter.py. You'll also want to update app.yaml to specify the app ID and version and to enable data export through the remote_api handler. Assuming your Python application lives in a directory called "exporter":
application: appid
version: python
runtime: python
api_version: 1
handlers:
- url: /remote_api
  script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
  login: admin
When this is done, deploy your application using appcfg.py:
appcfg.py update exporter
Make sure your default version is set to serve from the Java app! Your Python app is purely here to allow you to export data.
Exporting data
You're almost done. Now it's time to export the data. We'll need to manually specify the URL to fetch from in addition to the required exporter arguments. This URL will point to the Python version of the application:
appcfg.py --server=python.latest.appid.appspot.com download_data exporter --filename=data.csv --kind=Thing --config_file=exporter/thing_exporter.py
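If you run the export regularly, it can help to assemble the command from its parts instead of retyping it. The following is a minimal sketch: download_data_command is a hypothetical helper, and it reproduces exactly the flags used in the command above rather than adding any new ones.

```python
def download_data_command(app_id, version, app_dir, kind, config_file, filename):
    """Build the argument list for the appcfg.py download_data call above.

    Hypothetical helper; it only rearranges the values from the article's
    command line.
    """
    server = "%s.latest.%s.appspot.com" % (version, app_id)
    return [
        "appcfg.py",
        "--server=" + server,
        "download_data",
        app_dir,
        "--filename=" + filename,
        "--kind=" + kind,
        "--config_file=" + config_file,
    ]


cmd = download_data_command("appid", "python", "exporter", "Thing",
                            "exporter/thing_exporter.py", "data.csv")
print(" ".join(cmd))
```

The list form can be handed to subprocess.check_call(cmd) on a machine where the Python SDK's appcfg.py is on the PATH.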
This should be all you have to do.
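To picture what ends up in data.csv, here is a standalone sketch of how a property list like the one in ThingExporter turns entities into CSV rows. It is not the bulkloader's implementation: entities are plain dictionaries, entity_to_row and export_csv are hypothetical names, and treating a None default as "this property is required" is an assumption based on the uploading-data documentation.

```python
import csv
import io

# Same shape as the exporter's property list: (name, converter, default).
PROPERTIES = [("name", str, None), ("number", str, None)]


def entity_to_row(entity, properties):
    """Convert one entity (a plain dict in this sketch) into a list of strings."""
    row = []
    for name, convert, default in properties:
        if name in entity:
            row.append(convert(entity[name]))
        elif default is not None:
            row.append(default)
        else:
            # Assumption: a None default marks the property as required.
            raise KeyError("required property missing: %s" % name)
    return row


def export_csv(entities, properties):
    """Return CSV text with one row per entity, columns in property-list order."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for entity in entities:
        writer.writerow(entity_to_row(entity, properties))
    return out.getvalue()


things = [{"name": "widget", "number": 3}, {"name": "gadget, large", "number": 7}]
print(export_csv(things, PROPERTIES))
```

Values that contain commas are quoted by the csv module, so each row still splits cleanly into one column per exported property.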