GEPS 013: Gramps Webapp: Initial Thoughts
This was an initial discussion page about a data server. This information has been superseded by GEPS 013: Gramps Webapp.
Proposal for a Gramps web application. Now that the main graphical user interface (GUI) has been separated from the command-line interface (CLI), a web application would be the next logical step.
Motivation
Web developers need a way to access their Gramps data on the web and to build functionality on top of it. A Gramps webapp would make Gramps-based web projects possible.
Here is a small list of goals:
- Create a full-scale Gramps web framework
- Allow multiple users via the standard web browser
- Build on Gramps codebase and wealth of resources
- Use standards and well-known, well-tested frameworks where possible
- Consider the WSGI protocol
- Consider Django, ZOPE, and other frameworks
- Consider the data access
Example GMS Web Sites
Genealogy Management Systems on the web:
- http://www.dertinger.de/Dertinger_database/en/en_index.htm
- http://registry.phpgedview.net/index.php for example: [1].
- Note here: the intro page is a collection of gadgets/controls, which then link into the real data.
Discussion
Many of these comments were directed at an earlier prototype of a Gramps Server Mode. This GEP has been updated to reflect the larger goal of building a full-scale web application.
Richard Taylor noted:
- I think that you need to think very carefully about the use of threads in the server model. I don't think that the database backend is thread safe, and strange things might happen. (I have not looked at the database for about 3 years, so things might have changed.) You might think about an async server approach using something like Twisted.
- The use of eval to execute the client instructions is very unsafe from a security point of view. If you are going to use this model, you should restrict connections to those from localhost, use only Unix domain sockets, or think about strong authentication.
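As an illustration of the eval concern, here is a minimal sketch of an alternative: the server parses each request as data and dispatches it through a table of allowed commands, so client input is never executed as code. This is not how the prototype works; the command names, the handlers, and the dictionary standing in for the database are all hypothetical.

```python
import json

# Hypothetical handlers; the names are illustrative, not part of Gramps.
def person_count(db):
    return len(db["people"])

def surnames(db):
    return sorted({p["surname"] for p in db["people"]})

# Only commands listed here can ever be run on behalf of a client.
ALLOWED_COMMANDS = {
    "person_count": person_count,
    "surnames": surnames,
}

def handle_request(db, raw_request):
    """Parse a JSON request and dispatch it to a whitelisted handler."""
    try:
        request = json.loads(raw_request)
    except ValueError:
        return {"ok": False, "error": "malformed request"}
    if not isinstance(request, dict):
        return {"ok": False, "error": "malformed request"}
    command = request.get("command")
    if not isinstance(command, str) or command not in ALLOWED_COMMANDS:
        return {"ok": False, "error": "unknown command"}
    return {"ok": True, "result": ALLOWED_COMMANDS[command](db)}
```

Anything that is not a known command name is rejected, which limits what a remote client can do even before authentication is considered.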
Brian Matherly noted:
- I think the RPC strategy should be XML based. In particular, it makes sense to me for the Gramps "server" to be Web Services based. We should look at something standard like SOAP. That will make it most friendly for clients to access and opens a whole world of unimagined possibilities in the future for mashups.
- If the Gramps project is eventually expanded to include a "web server"/"web app"/"web services" aspect, I strongly believe that it should be done as a separate application. In fact, the end goal should probably be a multi-part repository of code that includes: a GTK-based desktop application, a CLI-only application with far fewer dependencies, a web server application that provides a web service, a web server application that serves actual web pages (real-time NavWeb), and one or more core libraries to support these applications in a well thought-out, abstract and maintainable manner.
- In order to achieve the goals I listed in #2, we first need agreement from the developers that we want to go in that direction. If we achieve that, we need an architectural plan that can bring us there. If not, we should branch this idea off as a separate project on SF.
Benny Malengier noted:
- BSDDB has a multiuser flag (which is not switched on), to allow for things like this.
- Goal 1 should be a definition of how a server can work on bsddb using the present src/gen, how requests from a client can be made, and how replies should be structured (I would hope our own XML schema). Then some framework to do the client itself, so as not to reinvent the wheel.
- Goal 2 is a bare-bones client: a list view of all primary objects (shown 25/50/100 entries at a time), and an HTML editor for the main info of the primary objects.
- Thoughts about breaking Gramps into parts
- For the server part, I would restrict support strictly to a Linux server for now.
TODO
- Develop a SQL Export for Gramps. DONE. See Gramps SQL Database
- Gramps currently needs a display. I had to make some changes in Gramps, not yet committed, that allow Gramps to run without X or a display. You won't notice this issue when you run the code if you have a display; the issue only shows up when trying to run without one.
- Explore Web frameworks. In progress, below.
Web App Architecture
There were many good ideas on the talk page and in the mailing list about the functionality of the webapp. A couple of them involved the security of the site.
An idea that emerged is to allow access to users who are not logged in, but only show them the data via a private proxy. That way, a visitor (and Google) can see things like "Living Smith". Once you log in, however, you gain the ability to see details and edit the data.
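The proxy idea can be sketched roughly as follows. This is only an illustration of the behaviour described above, not the Gramps proxy code: the `Person` fields, the `max_age` cutoff, and the function names are assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Person:
    # Hypothetical, simplified record; real Gramps objects are richer.
    first_name: str
    surname: str
    birth_year: Optional[int]
    death_year: Optional[int]

def is_probably_living(person, current_year, max_age=110):
    """Guess whether a person may still be alive (max_age is an assumed cutoff)."""
    if person.death_year is not None:
        return False
    if person.birth_year is None:
        return True  # unknown dates: err on the side of privacy
    return current_year - person.birth_year <= max_age

def view_for(person, logged_in, current_year):
    """Return the record a visitor is allowed to see."""
    if logged_in or not is_probably_living(person, current_year):
        return person
    # Anonymous visitors see only "Living <surname>", with dates hidden.
    return replace(person, first_name="Living", birth_year=None)
```

For example, an anonymous visitor looking at a John Smith born in 1980 with no recorded death would get a record reading "Living Smith", while a logged-in user would get the full record.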
Django
A prototype of a Gramps Django webapp is in branches/geps/gep-013-server. To run it, do the following:
- Download Django. I'm running version 1.0.3
- Check out the geps/gep-013-server branch from Git
- cd webapp/grampsweb
- Edit the path to the SQLite DB and the last 2 lines in settings.py to point to your source of Gramps and a database name
- make
- make run
- Point your webbrowser to:
Concurrent access problems
Concurrent read and write access causes problems when people accidentally change the same object at the same time. Gramps itself has elaborate signal handling for cases where open dialogs hold information that is no longer current. In a web environment, however, this becomes more difficult.
One possible basis for handling concurrent access:
- lock table with handles
- timestamp of last change --> already present.
The working method for support of concurrent access is then:
- If concurrent writes are possible: when doing a change/delete, one needs to obtain a lock on the handle in the lock table; otherwise the operation fails and the view must be updated
- If concurrent writes are possible: when doing an add, one needs a lock during creation of the handle and grampsid to ensure uniqueness. This lock need not be held for long, and should be released after obtaining the new IDs once it is clear that no conflict can happen anymore
- Always needed: when changing an object, one needs to pass a timestamp of when the data was originally read. If the timestamp of the object itself is more recent, the change fails and the web app view must be updated
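The three rules above can be sketched with an in-memory store. This is a minimal illustration under several assumptions: the `Store` class, its method names, and the handle format are hypothetical, and a simple counter stands in for the change timestamp so the example is deterministic.

```python
import itertools
import threading

class ConflictError(Exception):
    """Raised when a write must fail and the view should be refreshed."""

class Store:
    def __init__(self):
        self._mutex = threading.Lock()      # guards the tables below
        self._objects = {}                  # handle -> (data, last-change stamp)
        self._lock_table = {}               # handle -> user holding the lock
        self._stamps = itertools.count(1)   # stand-in for real timestamps
        self._handles = itertools.count(1)

    def add(self, data):
        # Short lock while the new handle is created, to ensure uniqueness.
        with self._mutex:
            handle = "H%04d" % next(self._handles)
            self._objects[handle] = (dict(data), next(self._stamps))
            return handle

    def read(self, handle):
        with self._mutex:
            data, stamp = self._objects[handle]
            return dict(data), stamp

    def lock(self, handle, user):
        with self._mutex:
            holder = self._lock_table.setdefault(handle, user)
            if holder != user:
                raise ConflictError("locked by %s" % holder)

    def unlock(self, handle, user):
        with self._mutex:
            if self._lock_table.get(handle) == user:
                del self._lock_table[handle]

    def change(self, handle, user, read_stamp, data):
        with self._mutex:
            if self._lock_table.get(handle) != user:
                raise ConflictError("lock not held")
            _, current_stamp = self._objects[handle]
            if current_stamp > read_stamp:
                raise ConflictError("object changed since it was read")
            self._objects[handle] = (dict(data), next(self._stamps))
```

A caller would read an object together with its stamp, take the lock, and pass the stamp back with the change; a `ConflictError` is the signal that the view needs to be refreshed.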
Furthermore, the following must be taken into account:
- For safety, as opposed to what Gramps does, only changesets are written, not the entire object. So one does not save a person, one saves a changed name of a person. The way Gramps rewrites everything every time one clicks on save was not a good decision way back, but it is understandable in a single-user environment
- The client must be able to deal sensibly with a failed save. In essence, in Gramps speak, this means a person-update, family-update, ... signal has happened since the view was constructed, and the view must update itself like the Gramps views/editors do on receipt of such a signal. Perhaps some JavaScript magic can do this easily?
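To make the changeset idea concrete, here is a small sketch in which a save carries only the changed fields, addressed by dotted paths. The path notation, the field names, and `apply_changeset` itself are assumptions made for illustration, and the timestamp check discussed above is left out for brevity.

```python
import copy

def apply_changeset(obj, changes):
    """Return a copy of obj with only the listed fields replaced.

    `changes` maps dotted paths (a hypothetical notation) to new values,
    for example {"name.surname": "Smyth"}.
    """
    updated = copy.deepcopy(obj)
    for path, value in changes.items():
        *parents, leaf = path.split(".")
        target = updated
        for key in parents:
            target = target[key]   # walk down to the containing dict
        target[leaf] = value
    return updated

person = {
    "gramps_id": "I0001",
    "name": {"first": "John", "surname": "Smith"},
    "gender": "unknown",
}

# Two users edit different fields; neither change overwrites the other.
after_first = apply_changeset(person, {"name.surname": "Smyth"})
after_second = apply_changeset(after_first, {"gender": "male"})
```

If each user had instead saved the whole person object as they last saw it, the second save would have silently restored the old surname.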
There seem to be two modes for concurrent edits on the web: something like what this wiki uses, and something more complicated using AJAX. I think we should make this as simple as possible for the following reasons:
- we're talking about sites that will have few simultaneous multiple users
- most genealogy data use is reads; edits are rare compared to reading and adding
- we can make it more sophisticated later, if we wish
I'm not suggesting that we don't handle this properly, but we can probably get by with what the wiki does (which is largely what you describe above):
- timestamp proposed edit item
- begin edit
- if others attempt to edit the same item in a given timeframe, prevent (or warn)
- if you attempt to save data that has changed since you began your edit, show the two versions (your currently edited version and the new version, plus a diff of them) and let you decide whether to 1) overwrite, 2) re-edit, or 3) abandon.
- when you save, update the timestamp
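A rough sketch of this wiki-style flow in plain Python, using the standard `difflib` module for the diff. The `Item` class, the function names, and the 600-second warning window are assumptions, and a version counter stands in for the timestamp.

```python
import difflib

class Item:
    def __init__(self, text):
        self.text = text
        self.version = 1           # stand-in for the last-change timestamp
        self.editor = None         # who most recently began editing
        self.editing_since = None  # when that edit began (seconds)

def begin_edit(item, user, now, window=600):
    """Start an edit; return (base_version, warning or None)."""
    warning = None
    if item.editor not in (None, user) and now - item.editing_since < window:
        warning = "%s started editing this item recently" % item.editor
    else:
        item.editor, item.editing_since = user, now
    return item.version, warning

def save(item, base_version, new_text, overwrite=False):
    """Save the edit; return None on success, or a diff if the item changed."""
    if item.version != base_version and not overwrite:
        diff = difflib.unified_diff(
            item.text.splitlines(), new_text.splitlines(),
            fromfile="current", tofile="yours", lineterm="")
        return "\n".join(diff)
    item.text = new_text
    item.version += 1              # "update the timestamp"
    item.editor = None
    return None
```

When `save` returns a diff, the web page can show both versions and offer the three choices: call `save` again with `overwrite=True`, begin a fresh edit from the current text, or abandon the change.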
This is not built into Django: