Baratine 0.8

Baratine 0.8.8 - 2015-03-27

@ResourceService removed

With the upgrades to @Service, the old @ResourceService is now redundant and has been removed. Details of @Service changes are below. Also see the store: and database: schemes for persistence.

async/Result patterns more strongly encouraged

To simplify the programming model, documentation examples now encourage use of the async Result API over blocking calls. The base Baratine programming model is now the async/Result model.

Blocking/sync APIs still exist, but are intended for special cases such as QA and integration with existing blocking code.

Example interface:

public interface MyHello {
  void hello(String arg, Result<String> result);
}

Example implementation:

@Service("public:///hello")
public class MyHelloImpl implements MyHello {
  public void hello(String arg, Result<String> result)
  {
    result.complete("Hello[" + arg + "]");
  }
}

Example Sync interface for testing and gateways:

public interface MyHelloSync extends MyHello {
  String hello(String arg);
}

The Sync interfaces should not be used in Baratine services. They exist for QA/jUnit convenience, and for easier integration with blocking Java applications.

Result.from (chaining)

The new from method replaces the older chain method, used when async methods themselves call async methods. from properly handles exception chaining as well as result chaining. A typical call looks like:

@OnLoad
private void onLoad(Result<Boolean> result)
{
  _store.get("/key", result.from(v->afterLoad(v)));
}

private boolean afterLoad(String v)
{
  ...
}

An alternative form passes the result:

@OnLoad
private void onLoad(Result<Boolean> result)
{
  _store.get("/key", result.from((v,r)->afterLoad(v,r)));
}

private void afterLoad(String v, Result<Boolean> result)
{
  ...
}

@Service updates

The base @Service lifecycle has been upgraded to handle most of the capabilities of @ResourceService to simplify the programming model. Instead of deciding which model is appropriate, there is now only a single, more capable model. There are two major changes:

  • Improved @OnLoad/@OnSave lifecycle support
  • Improved @OnLookup child support (built-in LRU)

@OnLoad/@OnSave/@Modify lifecycle

@Service now manages @OnLoad/@OnSave automatically.

@OnLoad is called once before any methods. It is never called again because the service is the sole owner of its data. Normally, the service will use the @OnLoad to load the saved state from a backing store, either one of Baratine’s own stores, or an external store like a traditional database.

The @OnLoad method is normally async with a Result returning a boolean, where the Result is only completed when the load completes. While the load is in progress, pending messages will be queued, to be executed after the load completes.

@OnSave is called after a batch when the service is modified/dirty. Any @Modify method called during a batch will mark the service as modified. If a method without a @Modify is called, the service will not be marked dirty.

Because of the batching, under heavy load, @OnSave will be called less frequently, improving performance.

When a @Service adds a @Journal, the @OnSave is only called when the journal requests a checkpoint, not on every batch. This journal behavior improves the efficiency of saves further.

The combination of the @OnLoad and @OnSave behavior is designed to improve database efficiency. Loads only happen once, at initialization, and saves are batched with efficiency improving under heavy load.
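
As an illustration, a counter service might combine these annotations as in the following sketch. The Store injection mirrors the store: example later in these notes; the service name, key, and field names are placeholders.

import io.baratine.core.*;
import io.baratine.store.Store;

@Service("public:///counter")
public class CounterServiceImpl {
  @Inject @Lookup("store:///counter")
  private Store<Long> _store;  // hypothetical backing store

  private long _count;

  @Modify  // marks the service dirty, so @OnSave runs after the batch
  public void increment(Result<Long> result)
  {
    result.complete(++_count);
  }

  // no @Modify: calling this will not trigger an @OnSave
  public void get(Result<Long> result)
  {
    result.complete(_count);
  }

  @OnLoad
  private void onLoad(Result<Boolean> result)
  {
    _store.get("/count", result.from(v->{
      _count = (v != null) ? v : 0;
      return true;
    }));
  }

  @OnSave
  private void onSave(Result<Boolean> result)
  {
    _store.put("/count", _count);
    result.complete(true);
  }
}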

@OnLookup LRU

@OnLookup is used to create REST-style resources. The @OnLookup method is a direct, non-blocking method. It should only return an uninitialized, unloaded instance for a resource. The resource will later load itself in an @OnLoad method.

Child resources created by @OnLookup share the same inbox as the parent service.

For example, a service managing “/auction” will use @OnLookup to create instances for “/auction/1”, “/auction/2”, etc. All child instances will share the same inbox as the original auction.

Because @OnLookup is direct, it bypasses the inbox, which means it should only be used to create a Java instance, without business logic. It’s essentially a newInstance call.

The children are saved in an LRU to help manage synchronization issues for instances. An application can assume that the instance is the unique active instance, e.g. there will never be two simultaneous instances of “/auction/1”.
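
For illustration, an auction manager might implement @OnLookup as in the following sketch. The signature here, receiving the child path and returning the new instance, is an assumption based on the description above, and AuctionImpl is a placeholder.

@Service("pod://auction/auction")
public class AuctionManagerImpl {
  @OnLookup
  public Object onLookup(String path)  // e.g. "/1" for "/auction/1"
  {
    // direct call: create the child instance only; no business logic,
    // no loading -- the child restores itself in its own @OnLoad
    return new AuctionImpl(path);
  }
}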

@Journal

Journaling now works properly with the @Service @OnLookup pattern for resources as described above. Journaling defers @OnSave calls (database writes) until a user-defined limit is reached, or the journal segment rolls over.

  • With a journal, @OnSave is called only when those thresholds are crossed.
  • Without a journal, @OnSave is called after every batch.

The @Journal annotation now has max-count and timeout values.
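
On a service, that might look like the following sketch; the Java attribute names maxCount and timeout are assumptions matching the max-count and timeout values described above.

@Journal(maxCount=10000, timeout=60000)
@Service("pod://auction/auction")
public class AuctionManagerImpl {
  ...
}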

@SessionService replaces channels

@SessionService is the updated name for the channel service, changed to better reflect its purpose.

A @SessionService creates a new instance for each web session, either websocket or HTTP push/pull channel. The instance serves as a facade to the rest of Baratine, and is local to the JVM where the TCP connection is located.

The @SessionService URL is always a single path segment with a session id generated by Baratine.

@SessionService("session://my-session-pod/my-session/{_id}")
public class MySessionServiceImpl implements MySessionService {
  private String _id;
  ...
}

Store/store: service

A simple key/value store is now available at the “store:” URL and at io.baratine.store. The store has both clustered (pod) and local support. The local version is at “store://local”, and the current-pod is at “store:///”.

To match service ownership partitioning, clustered store keys are hashed the same as pod URLs. So in a multi-server deployment, “store:///my-service/13” and “pod://my-pod/my-service/13” will be owned by the same JVM/node. This means gets and puts will be local database access, with replication to backup nodes.

Typically, the load will happen in an @OnLoad and is only needed once because of Baratine’s single-owner policy.

@Inject @Lookup("store:///my-service")
Store<String> _store;
...
@OnLoad
private void onLoad(Result<Boolean> result)
{
  _store.get("/key", result.from(v->{
     _value = v;
     return true;
   }));
}

@OnSave
private void onSave(Result<Boolean> result)
{
  _store.put("/key", _value);
  result.complete(true);
}
...

db: DatabaseService

The “db:” scheme, DatabaseService, is a more complete database interface than the simpler “store:” service.

The database is a sorted key/value store, partitioned when deployed in a cluster/pod. The key/value structure implies by design that keys are unique. If a second row is added with the same key, it replaces the first, like a put() for a map, unlike an insert in a relational database.

Queries on tables use SQL syntax, and multiple tables are supported, but joins are not. The primary use of the database is persistence with Baratine’s @OnLoad and @OnSave. It is not intended as a general relational database.
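
A lookup and query might look like the following sketch, which borrows findOne and Cursor from the “bardb:” example in the 0.8.4 notes below; whether “db:” exposes exactly the same methods is an assumption.

@Inject @Lookup("db:///")
DatabaseService _db;
...
String sql = "select id, data from my_table where id=?";

Cursor cursor = _db.findOne(sql, 17);

if (cursor != null) {
  System.out.println("data: " + cursor.getString(2));
}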

Multipod Deployment

A single *.bar file can now deploy to multiple pods at once, using URL authority in the @Service addresses, and config defined in META-INF.

When a @Service URL has an authority section like pod://my-pod/my-service, it is only deployed in the my-pod pod. This allows a single .bar or .jar file to contain the services for all pods while deploying each selectively.

The @Service looks like the following:

@Service("pod://auction/auction-manager")
public class AuctionManagerImpl implements AuctionManager {
  ...
}

The @SessionService looks like the following:

@SessionService("session://session/auction/{_id}")
public class AuctionSessionImpl implements AuctionSession {
  ...
}

The pod configuration belongs in META-INF/baratine/config/pods/my-service.cf. All files in META-INF/baratine/config are copied to the equivalent path in BFS, bfs:///config/pods/my-service.cf. The configuration looks like:

pod auction {
  archive "/usr/lib/pods/my-app.bar";
}

pod user {
  archive "/usr/lib/pods/my-app.bar";
}

pod session {
  archive "/usr/lib/pods/my-app.bar";
}

bfs:///config/pods ordering

The /config/pods files are now sorted before being parsed. This sorting allows a simple convention to ensure config ordering. Since “00-xxx” will be configured before “10-xxx”, applications can name their config files to ensure an expected order.

.bar configuration: META-INF/baratine/config

Deployment will copy files in META-INF/baratine/config to their BFS equivalent. To configure a pod, META-INF/baratine/config/pods/10-mypod.cf will be copied into bfs:///config/pods/10-mypod.cf.

cross-pod lambda support

Lambda arguments to foreign pods are now supported transparently. Serialized lambda expressions require that the class containing the lambda be loaded on the target service. This means that the call to the target pod must deserialize the lambda using a custom classloader that contains the caller’s classes. Because pods are deployed with BFS, which is shared across all nodes, the classes are already available to all JVMs.

This behavior should be invisible. If an application calls to a foreign pod with a lambda expression, the expression should simply execute on the foreign service.

The API might look like the following:

import io.baratine.stream.*;

public interface MyApi {
  void doOperation(RunnableSync run);
}

Note: Baratine’s RunnableSync is used as a convenience, because distributed lambda serialization requires that the lambda implement Serializable.

The caller might look like:

@Service("pod://pod-a/service-a")
public class AImpl {
  @Inject @Lookup("pod://pod-b/service-b")
  private MyApi _api;
  ...

  _api.doOperation(()-> { System.out.println("Hello"); });
}

The operation will occur in the target pod. So the println will be saved in the target pod’s log files.

event: is distributed

The “event:” scheme is now distributed. If the URL includes a pod as the authority, subscriptions and events will be directed to that pod. The owning node is partitioned based on the URL, as with normal foreign pod calls.

Publishing to an event looks like the following:

@Inject @Lookup("event://my-pod/my-event")
MyApi _api;
...
_api.myMethod("my-arg");

As before, the event service uses Baratine standard methods. There is no separate event API.

To subscribe, either use the @Subscribe annotation on the listening service, or use the ServiceRef.subscribe method:

@Subscribe("event://my-pod/my-event")
@Service("/my-service")
@Startup
public class MyService {
  void myMethod(String myArg)
  {
  }
}

Auction example on github

There is now a multi-pod auction example on github at https://github.com/baratine/auction. To check it out use:

$ git clone https://github.com/baratine/auction

Some simpler examples are also available at https://github.com/baratine.

Bugfix list:

  • REST: enable PUT, DELETE following REST models
  • resource: @BeforeDelete issues (rep by Riccardo Cohen)

Baratine 0.8.7 - 2015-01-07

Pod Partitioning

Pod nodes can now be directly accessed by two methods: a pod node URL, and the node and getNodeCount methods in ServiceRef.

At runtime, application code typically does not know how many nodes are configured for a pod. The getNodeCount returns that number.

The “pod:” scheme selects the node based on the URL hash by default to support Baratine’s single-owner policy.

In cases where a copy of a service runs on all nodes, like the ResourceManager for @ResourceService, the instance can be selected with the node method of the ServiceRef or with the “pod://pod:2/my-service” syntax.
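
In code, node selection might look like the following sketch; node() returning a per-node ServiceRef is an assumption based on the description above.

ServiceRef serviceRef = manager.lookup("pod://my-pod/my-service");

int count = serviceRef.getNodeCount();

for (int i = 0; i < count; i++) {
  MyService service = serviceRef.node(i).as(MyService.class);

  service.myMethod("arg");  // runs on node i
}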

Deployment

The deployment locations have been updated to use BFS more effectively.

$ bin/baratine deploy examples/hello.jar
deployed hello.jar to bfs:///config/pods/pod.cf

The directory bfs:///config/pods is the directory for the “pods” service, which is responsible for deploying pods and services.

By convention, any service will use bfs:///config/my-service as a directory for its config files, including application servers. Since BFS is distributed, this configuration is automatically available to all servers in the cluster. This configuration model is similar to the unix inetd configuration.

For deployment, the deployed service is placed in /usr/lib/my-pod.bar. BFS makes that file available for all servers.

The config file pod.cf points to the my-pod.bar file.

$ bin/baratine cat /config/pods/pod.cf
# created by deploy

pod pod {
  application "bfs:///usr/lib/pod.bar";
}

Note that the config syntax is the same as in the baratine.cf file. This allows for dynamic configuration of pods.

As before, cat /proc/services will show the deployed services:

$ bin/baratine cat /proc/services
[
{ "pod" : "pod",
  "services" : [
  { "service" : "public:///hello",
    "queue-size" : "0"
  }]
}]

BFS Directory Conventions:

/proc/                  -- active system statistics
/config/<servicename>/  -- configuration for a service
/system/                -- reserved for Baratine system files
/usr/lib/               -- jar and bar files

Partial Change List

  • pod: internal revamp of pod deployment
  • heartbeat: internal revamp of heartbeat
  • deploy: use /config/pods/*.cf for all pod creation/deploy
  • deploy: allow multiple jars in a deploy command (rep by Riccardo Cohen)
  • ServiceManager: add ServiceManager.getPodNode()
  • http rest: add support for GET/PUT with get()/put() (rep by Riccardo Cohen)
  • http rest: create() was not returning proper serviceRef (rep by Riccardo Cohen)
  • resource: @BeforeCreate is implicitly a @Modify method (rep by Riccardo Cohen)

Baratine 0.8.6 - 2014-11-25

JDK-8

Baratine now requires JDK-8 because of the support for lambda expressions and improved Java interface capabilities.

maven deployment

The Baratine API and implementation are now available as a maven repository on baratine.io/m2. A pom fragment might look like:

...
<repositories>
  <repository>
    <id>baratine.io</id>
    <name>Baratine Repository</name>
    <url>http://baratine.io/m2</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>io.baratine</groupId>
    <artifactId>baratine-api</artifactId>
    <version>0.8.6</version>
  </dependency>

  <dependency>
    <groupId>io.baratine</groupId>
    <artifactId>baratine</artifactId>
    <version>0.8.6</version>
  </dependency>
</dependencies>

io.baratine.core.Result

Result is now a JDK-8 FunctionalInterface with complete() as its functional method, so a lambda expression can be passed wherever a Result is expected. A method definition with an async callback will look like:

void myMethod(String myArg, Result<String> myResult);

And a corresponding lambda call will look like:

myService.myMethod("myArg", x->System.out.println("result: " + x));

Note: as always, the result callback occurs in the caller’s thread if the caller is a Baratine service. If the caller is not a service, Baratine will spawn a thread for the callback.

complete/fail

The main Result methods have been renamed to complete and fail.

Result chaining

To simplify intermediate calls, where a serviceA calls serviceB and needs to chain the result from serviceB and do post-processing before delivering its own result, the Result interface now includes two chain() methods.

An example method for serviceA might look like the following, where myMethod calls the serviceB method leafMethodB. When leafMethodB returns, the callback calls postProcess for any post-processing and then returns the result to the original caller.

The chain automatically propagates exceptions. If leafMethodB throws an exception, the chained Result will send the exception to the original caller.

As always, the callback runs in the ServiceAImpl thread:

class ServiceAImpl
{
  @Inject @Lookup("/serviceB")
  private ServiceB _serviceB;

  public void myMethod(String arg, Result<String> result)
  {
    _serviceB.leafMethodB(result.chain(x->postProcess(x, arg)));
  }

  private String postProcess(String resultB, String arg)
  {
    ...
  }
}

Result.make()

Result.make is a convenience method for using lambdas for both the complete() and the fail() methods of a result.
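
A call might look like the following sketch; the argument order, completion lambda first and failure lambda second, is an assumption.

myService.myMethod("myArg",
                   Result.make(x->System.out.println("result: " + x),
                               e->e.printStackTrace()));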

Result.fork()

For clients that call multiple leaf services and need to wait for all to complete before post-processing, the Result.fork method uses JDK-8 stream patterns to combine the results. Result.fork is experimental and is likely to change.

io.baratine.core.ResultFuture

ResultFuture is a renaming of the previous future. Futures are discouraged in general because they block the caller and impose a threading performance penalty when the JDK/OS wakes the blocked thread.

Stream/lambda interfaces (following JDK-8)

Services can create streamable methods using the io.baratine.stream.StreamBuilder and ResultSink arguments. The caller can build custom queries using the StreamBuilder and caller lambda expressions.

The client API and service implementation will differ. The client API uses a StreamBuilder and the service implementation has a corresponding ResultSink.

The service impl calls the ResultSink with all the objects in the stream using accept() and finishes by calling end(). The call to end() is required to finish the stream. An implementation might look like the following:

void myStream(String myArg, ResultSink<String> result)
{
  ArrayList<String> list = getList(myArg);

  for (String value : list) {
    result.accept(value);
  }

  result.end();
}

The client can process the stream as it sees fit. Lambda expressions in the stream execute in the target method’s thread (not the caller’s thread.)

The following executes the println lambda for each stream value, in the MyServiceImpl's service thread:

myService.myStream("arg1")
         .forEach(x->System.out.println("arg: " + x));

The following selects strings that start with “myprefix-” and then concatenates them. The filter and the concatenation run in the MyServiceImpl thread, while the result println runs in the caller’s thread, because it’s a normal Result.

myService.myStream("arg1")
         .filter(x->x.startsWith("myprefix-"))
         .reduce((x,y)->x + "::" + y,
                 x->System.out.println("Result: " + x));

Stream calls across a pod

Stream calls across a pod become map/reduce calls if the pod has multiple nodes. A “pair” pod will run the stream call on both node-0 and node-1 and then combine the results before returning to the caller. The calling code is identical to the non-pod call:

myService = manager.lookup("pod://my-pod/my-service")
                   .as(MyService.class);
myService.myStream("arg1")
         .filter(x->x.startsWith("my-prefix"))
         .reduce((x,y)->x + "::" + y,
                 x->System.out.println("Result: " + x));

Note that the lambda must be accessible to the remote service, because the JDK’s lambda object does not include the code itself, but merely a pointer to the code. To make new lambda code available to the service, the caller can point to a jar file:

myService.myStream("arg1")
         .jar("file:///home/user/myjar.jar")
         .filter(...)
         .reduce(...)

Baratine will deploy myjar.jar across the cluster by publishing it to BFS, and will use it when the service executes the stream code.

io.baratine.stream

The interfaces in io.baratine.stream duplicate equivalent interfaces in JDK-8 java.util.function with two important differences:

  • They implement Serializable, which is required to serialize lambdas for remote calls.
  • They include both -Sync and -Async versions of each interface.

The -Async versions are important because a blocking lambda expression will block the entire service thread, potentially freezing all requests. If a stream lambda needs to wait for a remote result, the -Async allows it to release the thread.
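
As a hypothetical illustration of the shape (these are not the actual interface definitions), a -Sync function returns its value directly, while the -Async variant delivers it through a Result so the service thread can be released:

public interface MyFunctionSync<T,R> extends java.io.Serializable {
  R apply(T value);
}

public interface MyFunctionAsync<T,R> extends java.io.Serializable {
  void apply(T value, Result<R> result);
}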

Deployment Updates

Note: deployment work is in progress. We expect the next release to contain significant changes to service/pod deployment.

Deployment in Baratine is to the internal distributed filesystem. You can debug deployment by using the CLI ls and get methods in the /system/deploy tree.

The new deployment uses a simplified model, where a new pod replaces the previous one. The canonical name for a pod deployment is my-pod.bar. For example /system/deploy/pods/my-pod.bar is the deployment for my-pod. If the CLI deploys a simple jar or deploys a directory, Baratine will automatically create a .bar file.

If you want to create a full .bar, it’s just a renamed .jar file with the following structure:

lib/*.jar    # internal jar files
classes/*    # loose .class files
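
Since a .bar is just a renamed .jar, the standard jar tool can assemble one, for example:

$ jar cf my-pod.bar lib/ classes/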

CLI deploy changes:

An undeploy command has been added.

The deploy command will now automatically create a .bar file out of .jars from the command line.

Kelp/Kraken performance and memory work

The underlying database kelp/kraken has undergone significant performance work to better handle large amounts of data without requiring it all to stay in memory. BTree leaf pages are swapped out using an LRU to save memory.

jUnit updates

The Baratine jUnit support has been upgraded to more easily deploy single services. The services attribute of @ConfigurationBaratine lists the services that should be deployed for the test. The service can be a plain @Service, a @ResourceService or a @ChannelService.

A new Baratine server will start and deploy the service. The server has all the normal Baratine capabilities including the embedded database and BFS.

An example test might look like the following:

@RunWith(RunnerBaratine.class)
@ConfigurationBaratine(services=MyServiceBean.class)
public class Test
{
  @Lookup("/my-service/1")
  MyService _service1;

  @Inject
  ServiceManager _manager;

  @org.junit.Test
  public void test()
  {
  }
}

Embedded Server updates

The embedded server API is still in progress (and therefore in com.caucho.baratine).

An embedded server can now join a Baratine cluster as a more powerful client, including the ability to access the Baratine FileSystem.

Creating a server looks like:

ServerBaratine server;

server = Baratine.newServer()
                 .port(8086)
                 .seed("192.168.1.10", 8085)
                 .seed("192.168.1.11", 8085)
                 .build();

ServiceManager manager = server.client();

...

server.close();

A remote proxy would normally use a pod scheme:

MyService service = manager.lookup("pod://my-pod/my-service")
                           .as(MyService.class);

BFS would look like:

FileService root = manager.lookup("bfs://cluster")
                          .as(FileService.class);

Partial Change List

  • hamp: change to method cache for send/query
  • stream: add map/reduce stream prototype
  • network: bind before database to avoid multiple jvms with same database (#5136)

Baratine 0.8.5 - 2014-09-01

Lifecycle Updates:

The lifecycle annotations have been renamed for clarity. There are now three main lifecycle methods: @OnInit called before everything else, @OnActive called after the replay, and @OnDestroy called on graceful shutdown.

@PostConstruct // outside serviceRef context
@OnInit        // in serviceRef context, outside batch
[replay]       // replay methods for @Journal (in batch)
@OnActive      // in serviceRef context, in batch
[methods]      // normal method calls
@OnDestroy

@Workers:

Multiple worker threads are enabled with the @Workers attribute on a service, used for blocking gateways like JPA/Hibernate services or REST clients. The workers all poll the same inbox. Baratine tries to minimize the number of worker threads woken, and prioritizes the first workers to improve CPU caching efficiency. So a @Workers(100) service might only ever have three workers active, but has the capacity to expand to 100 workers.

The changes in 0.8.5 include wake-minimization. Previously, all workers were woken up. Now, only the workers needed to process the messages are woken. Wake-minimization improves CPU caching and reduces resource requirements like database connections or REST sockets.

Also in 0.8.5, @OnInit and @OnActive are now lazy. Lazy-init minimizes the resources allocated when a service has many workers but few are used. In contrast, since @OnDestroy is always called on all workers, it’s possible for @OnDestroy to be called on a worker that has never been initialized.
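
A blocking gateway using @Workers might look like the following sketch; the service name, User type, and blocking load method are placeholders.

@Service("/user-gateway")
@Workers(20)
public class UserGatewayImpl {
  public void findUser(String id, Result<User> result)
  {
    // a blocking JDBC/REST call is acceptable here, because the
    // other workers polling the same inbox keep the service responsive
    User user = loadUserBlocking(id);

    result.complete(user);
  }
  ...
}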

pod-map scheme for broadcast and map/reduce

In a distributed environment, the pod-map: scheme will broadcast a copy of each method call to each pod-node. If the method call passes a ServiceRef as an argument, map services can send their results to a reduce service.

Since the result of a broadcast query isn’t sensible, the query will return null. The query can be used as an exception gatherer, since the result will only return when all broadcast nodes have received their messages.

Service code and calling code is unchanged. Both continue to use plain Java methods. The only difference is the proxy lookup:

@Inject @Lookup("pod-map://my-pod/my-service")
MyMapService _map;

Or:

ServiceManager manager = ...;
MyMapService map = manager.lookup("pod-map://my-pod/my-service")
                          .as(MyMapService.class);

The pod-map was added primarily for internal use to support a map/reduce real-time query on the internal database that’s in development.

Kraken database updates:

The underlying Kraken/Kelp database is becoming more of a complete database. Although it’s primarily intended to support @ResourceService and internal databases like BFS and deployment, it’s implemented as a key/value store accessed with semi-standard SQL.

The database is accessible to services with the “bardb:///” URL.

Since the database is primarily used internally, additional SQL support is a relatively low priority, but this could change if developers need the capability.

WHERE expressions:

The expression syntax has been expanded to support arithmetic and logical operations. The current syntax supports:

  • arithmetic: +, -, *, /, %
  • comparison: =, <>, <, >, <=, >=
  • logical: AND, OR
  • object path: .
  • BETWEEN
  • fun(): startsWith(), isShard(), isShardLocal()
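
For example, a query combining several of these operators might look like the following; the table, fields, and exact fun() call syntax are illustrative assumptions.

SELECT id, value FROM my_table
  WHERE value BETWEEN 10 AND 20
    AND startsWith(name, 'my-')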

distributed SELECT (map/reduce)

For SELECT queries that require scanning of each node, Kraken can now issue an internal map/reduce query.

Kraken queries the owning node when the query specifies the key, making get-style queries fast. If the calling node is the data-owning node, the select will be a local load, often cached in memory. (@ResourceService guarantees that its own data is on the local node.)

But more general queries on other fields require scanning the entire database. Since the database is distributed, multiple nodes need to be queried. Kraken therefore creates a map/reduce query to select data from each node and combine them.

UPDATE

The UPDATE query has been added to Kraken. In a distributed deployment, the UPDATE will be mapped to all owning nodes. Because of replication, only the active owning node for a key will execute the query, ensuring that a SET v=v+1 will be executed once.
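
An increment update like the one described might look like:

UPDATE my_table SET v=v+1 WHERE id=13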

MAP

An experimental MAP query and map() call is used for local callback-based selects. MAP acts like a local SELECT that sends matching rows to a method instead of a traditional result cursor. The map() call requires a MethodRef; MAP takes the results for each matching row and passes them as arguments to the method. The MAP syntax is the same as SELECT:

MAP id,value FROM my_table WHERE value > 10

Where the corresponding method might be:

my_method(long id, long value)

Transactions

Baratine discourages transactions in general, but they are sometimes still required when coordinating updates to multiple services. (Although even in that case a coordination service may be a better solution.)

For those cases where transactions are required, Baratine supports a transaction service. There are two steps:

  • Obtain a XA-enabled proxy to the service
  • Use the TransactionManager to begin/commit the transaction

To enlist a service as an XA-enabled resource, look it up with the xa: scheme:

@Inject @Lookup("xa:///my-service")
MyService _service;

Or:

ServiceManager manager = ...;
MyService service = manager.lookup("xa:///my-service")
                           .as(MyService.class);

Outside of a transaction, the proxy will act as a normal service.

To obtain the transaction manager, look up the “xa:///” scheme:

@Inject @Lookup("xa:///")
XAService _xaManager;

Using the transaction looks like the following:

_xaManager.begin();
_myService.myMethod(...);
_xaManager.commit();

The transaction will hold the message until the commit(). At commit time, the manager performs a 2-phase commit. Any @Prepare methods are called in phase 1. If phase 1 completes successfully, the actual methods are called. If phase 1 fails, any @Rollback methods are called.

@Prepare/@Rollback

Transaction-aware services can implement @Prepare and @Rollback methods. @Prepare methods can reserve resources during the prepare phase, to be updated during the plain method call. A prepare method looks like:

@Prepare("myMethod")
public void prepareMyMethod()
{
  ...
}

The "myMethod" value of the @Prepare identifies the owning method.

Change List

  • baratine: @OnInit/@OnActive and @OnDestroy changes for @Workers
  • baratine: local transactions
  • kraken: distributed UPDATE
  • kraken: add UPDATE query
  • kraken: add MAP query
  • kraken: map/reduce SELECT
  • kraken: SELECT expressions
  • baratine: add pod-map: scheme
  • baratine: optim @Workers to reduce extra wakeup and prioritize first worker
  • baratine: update @BeforeBatch/@AfterBatch for @Workers
  • baratine: skip @BeforeBatch for @OnInit
  • baratine: change @OnStart to @OnInit, drop @Restore
  • embed: load “event:” and “timer:” services

Baratine 0.8.4 - 2014-07-24

Change List:

  • baratine: com.caucho.baratine.Baratine newManager() for embedded Baratine
  • kraken: reference counting issues on temporary streams when replicating
  • bfs: merge file/dir tables
  • junit: basic support added
  • bfs: add -o for get
  • health: /proc/webapps
  • health: added 6h pdf reports in bfs://local/system/report/cron
  • health: added graphs, profiling, thread dump to report-pdf
  • bfs: add bfs://local support for local-only filesystem
  • pdf: report-pdf with basic graph, scoreboard, thread, profile
  • kelp: search performance, taking advantage of sorted pages
  • kelp: memory GC delegated to JVM. Pages only hard-linked when updated
  • kelp: disk GC change to clean most recent segments and limit copied segments
  • kelp: variable segment sizes so small tables don’t fill disk
  • kelp: large blobs need to be flushed immediately for reliability
  • kraken: temp stream needs to stay in memory for small items
  • bfs: status fixes for dir/file
  • resource: added support for String, long keys in @ServiceResource
  • bfs: use kraken watch support
  • kraken: add watch support
  • bfs: use kraken queries for remoting instead of using table directly
  • kraken: remove timeout/version support on restart and query
  • kraken: move select pod dispatch into SelectQuery from KelpBacking
  • kelp: remove timeout/version to avoid phantom re-insert of old data

Baratine embedding

An embedded Baratine manager can create services in any JVM context. The services can be used for threading, for singleton management, or for streams like TCP sockets, websockets, or log files.

import io.baratine.core.*;
import com.caucho.baratine.*;

ServiceManager manager = Baratine.newManager().build();

MyBean bean = manager.service(new MyBeanImpl()).as(MyBean.class);

@ServiceResource keys

@ServiceResource can now use String and long keys in addition to the 0.8.3 integer keys:

@ServiceResource("/test/{_id}")
public class MyService {
  private String _id;

  ...
}

The @ServiceResource keys are used in the REST-style services where each URL refers to a different resource instance managed by a single service. Baratine manages the persistence and replication of @ServiceResource instances for scaling and failover.

junit

Baratine now provides a specialized JUnit test runner, which starts a Baratine server, deploys services, and executes JUnit tests. The JUnit runner is the class com.caucho.junit.RunnerBaratine and is meant to be used with JUnit’s org.junit.runner.RunWith annotation.

RunnerBaratine requires the JUnit test class to be annotated with com.caucho.junit.ConfigurationBaratine, which specifies the modules to deploy to Baratine and configures the Baratine port (optional, defaulting to 8085).

Please see JUnit Testing for more detail.

Examples demonstrating how to test Baratine services are available on github: http://github.com/baratine

health monitoring

Health monitoring by PDF reports is now supported. See Monitoring and Management.

PDF reports from Baratine’s health monitoring include graphs of memory, CPU, and threading; thread dumps; and continuous CPU profiling. General statistics are gathered each minute from JMX and internal probes, saved to an internal database, and displayed in the PDF report. A low-level profile gathers thread performance each second, and also is included in the report.

To generate a report from the command line, use the report-pdf command:

$ baratine report-pdf
$ baratine report-pdf -o my-report.pdf

Automatic Archived Reports

Every 6 hours, a new PDF report is generated and saved in the internal filesystem as Day-Hour.pdf, like Tue-18.pdf. Each report is kept for a week, until the next week’s report replaces it. This archiving ensures that reports are always available to debug issues after they occur.

To list the saved reports, use the ls command:

$ baratine ls bfs://local/system/report/cron
...
Tue-12.pdf
...

To get a report, use the get command:

$ baratine get bfs://local/system/report/cron/Tue-12.pdf

You can also use the -o option of get to save the report to an alternative name:

$ baratine get -o my-test.pdf bfs://local/system/report/cron/Tue-12.pdf

/proc/webapps

For webapp status, the /proc/webapps shows all the webapps on the current server, their state, and any startup errors. Since each Baratine pod is deployed as a webapp, you can look at /proc/webapps as a quick insight into the server state:

$ baratine cat /proc/webapps

deployment fixes

Fixes to the kelp/kraken database described below also fix issues with Baratine service deployment.

kelp/kraken updates

Variable Segments

The kelp/kraken database size has been reduced by adding multiple sizes for the write-segments, from 256k to 8M. The previous single size of 8M meant even small tables would use tens of megabytes of disk, even if the table itself was tiny.

kelp/kraken is write-only to each segment, with many internal segments inside the mmapped file. The write-only process means a completed segment is fixed until it is garbage-collected, when any non-obsolete data is copied to a new segment. The write-only process uses several segments before garbage collection, even for a small table. So in the old system, a small table with 20k of data might use 4 segments of 8M each, using 32M for the table. With several active tables (Baratine has about 10 of its own), that 32M quickly became 300M.

With the new smaller segments, the 4 active segments for a small table now use 4 × 256k, or 1M, instead of 32M. Garbage collection has also been updated to more quickly compact the most frequently updated segments.

Watches (listeners)

Kelp/Kraken watches are service callbacks that get notified when a table entry changes. Because the watches are in Baratine, the callbacks are run in the registering service’s thread, not the kelp thread; the callback is a Baratine message, not a direct method call.

The watch is used for the Bartender filesystem, specifically for deployment. When a new service jar is uploaded, the system detects the change and deploys it. The watch is the basis of that system. Putting the watch into kelp/kraken improves testability, and avoids re-implementing watches and listeners for every system based on the database.

Garbage Collection

Disk garbage collection is now generational, collecting the most recent segments first, and working through the older segments. Because new segments include delta entries and multiple copies of each page, a new-generation pass will free data more quickly. In addition, it collects hot-spot pages. Pages with less frequent updates settle into long-lived segments.

Memory garbage collection now uses the JVM’s own GC with soft references. The previous version combined weak references with an LRU of hard references. With the changes, hard references are used only for dirty pages.

Search Performance

Kelp search, used by select queries, now uses the sorted data to avoid scanning all blocks in a leaf page.

Because Kelp pages are write-only, like the disk, updates and deletes append overriding row entries to the page, which means that searches need to start from the page beginning to find the next item, just as get queries do. For efficiency, page compaction sorts the compacted entries. With sorting, a search or get skips blocks and entries that previous versions would have tested.

database: experimental access to kraken/kelp

The kelp/kraken database system is available for experimental access with the “bardb:///” URL and the DatabaseService API. An introduction is available at database. The query language is similar to SQL. Currently, it is incomplete, because only the expressions needed by Baratine itself have been implemented.

A select call might look as follows:

@Inject @Lookup("bardb:///") DatabaseService _db;

void select()
{
  String sql = "select id, name, data from test where id=?";

  Cursor cursor = _db.findOne(sql, 17);

  if (cursor != null) {
    System.out.println("  id: " + cursor.getInt(1));
    System.out.println("  name: " + cursor.getString(2));
    System.out.println("  data: " + cursor.getString(3));
  }
}

bfs: Bartender filesystem

The internal filesystem used for deployment, /proc reports, and the saved PDF reports is available to Baratine services at the “bfs:” URL with the FileService API that includes both async calls and sync calls for convenience.

Writing to a file looks like:

@Inject @Lookup("bfs://local/home/my-user/test.txt")
FileService _myFile;

...
try (OutputStream os = _myFile.openWrite()) {
  os.write("hello".getBytes());
}

Reading from a file looks like:

@Inject @Lookup("bfs://local/home/my-user/test.txt")
FileService _myFile;

...
try (InputStream is = _myFile.openRead()) {
  int ch;

  while ((ch = is.read()) >= 0) {
    System.out.print((char) ch);
  }
}

Raspberry Pi Startup Times

To improve the startup time on the Raspberry Pi, some work was done to minimize file reads and network access. Because the largest cost seems to be loading the classes Baratine uses, and because the watchdog and the command-line process also incur this overhead, it’s better to use the -fg option with the Linux init.d process instead of using the watchdog.

The command-line for starting Baratine in the foreground looks like:

$ baratine start -fg
...

If that command is placed in an init.d configuration file, the Linux init.d process can manage restarts. In the case of a Raspberry Pi, this can reduce the startup time to 40s from the full 90s of a watchdog start.

Baratine 0.8.3 - 2014-06-12

0.8.3 is a major update to the programming model, with @ResourceService at its core.

Resource Services

Resource services are persistent and distributed by design. They are intended to follow the REST resource model, where each resource has a unique URL, is persistent, and has a number of methods defined on it.

A singleton resource service has a static URL, and might look like the following:

import io.baratine.core.*;

@ResourceService("public:///my-counter")
public class MyCounter {
  private long _counter;

  @Modify
  public long incrementAndGet()
  {
    return ++_counter;
  }
}

The @Modify annotation tells the resource manager which methods modify the resource state. The resource will be saved after @Modify methods are called, but not saved for non-annotated methods.

Multiple Resources

Resource services can have multiple instances, named by the URL. For example, the counter might be named by an integer with a URL like /my-counter/3 as follows:

import io.baratine.core.*;

@ResourceService("public:///my-counter/{_id}")
public class MyCounter {
  private int _id;
  private long _counter;

  @Modify
  public long incrementAndGet()
  {
    return ++_counter;
  }
}

Each “/my-counter/5” is a unique resource instance, but managed by the same service thread.

Partitioning

Services are partitioned to multiple servers based on the URL. “/my-counter/1” might go to server 3 while “/my-counter/2” might go to server 1.

The internal resource database is partitioned to match. The primary server for the data for “/my-counter/1” will also be server 3. This coordination means that resource loading and storage is local to the same JVM, making it very fast.

Distribution and Packaging

Distribution has been simplified. The baratine.jar is now sufficient without needing any dependencies.

The default working directory (--root-dir) is now /tmp/baratine, which simplifies testing, development, and experimentation.

There are three main ways of starting a service:

  • bin/baratine convenience script
  • java -jar lib/baratine.jar
  • com.caucho.cli.baratine.BaratineCommandLine

Baratine’s JNI extensions are now packaged inside the jar.

Command Line

The command line has been normalized and simplified. Commands are listed with the ‘help’ command:

$ bin/baratine help

Primary commands:

  • start – starts a baratine server, with options for background/foreground
  • shutdown – shuts down the watchdog and all baratine servers
  • deploy – deploys a new service jar
  • put, get, cat, ls – filesystem commands for deployment into BFS
  • jamp-query – debugging query for a baratine service

command-line shell

The Baratine command-line can also be started in a shell mode, which can be convenient for stress testing, benchmarking or scripts. If bin/baratine is started without a command, it will start a shell. The shell mode can run a script with the -i option:

$ bin/baratine -i stress-test.bsh

An example stress test script might look like:

# start baratine in the script's JVM
start -bg --deploy examples/hello.jar
sleep 2
# warmup run
bench-jamp -n 1000 /hello-service /hello
# benchmark run
bench-jamp -n 100000 /hello-service /hello

/proc

Debugging information about Baratine’s current internal state is available in /proc, accessible with the BFS cat command. In 0.8.3, there are three /proc systems:

  • /proc/servers - servers in the cluster
  • /proc/pods - pods (distributed services) in the cluster
  • /proc/services - services deployed on this machine

For example, /proc/pods/local (the local pod) might show:

$ bin/baratine cat /proc/pods/local
{ "pod" : "local",
  "type" : "solo",
  "sequence" : 0
  "servers" : [
    {"server" : "192.168.1.100:8085"}
  ],
  "nodes" : [
    [0]
  ]
}

source (.git)

Cloning and building baratine looks like the following:

$ git clone git://git.caucho.com/baratine.git
$ ant -f baratine/build.xml
$ ant -f baratine/build.xml native   # optional JNI building
$ java -jar baratine/lib/baratine.jar start

Baratine 0.8.1 - 2014-04-01

  • @OnRestore now allows Result<Boolean> (#5583, rep by J. Willis)
  • segment store auto-size issues (#5582, rep by J. Willis)
  • @PartitionKey
  • MethodRef.partition() and partitionSize()
  • bartender refactor
  • Service-Pod in manifest
  • Allow @Inject for ChannelScoped
  • service clusters enabled and pod://n4.my-pod/my-service
  • added -p, -wp for port and watchdog-port
  • port defaults to serving both cluster and http requests
  • deploy refactored to use bfs. .git is now obsolete for deployment
  • bfs refactored to new kraken/kelp
  • Kelp object, string
  • Prototype DatabaseService
  • Kelp multi-table support
  • Added vararg support for proxies
  • Object store added basic query ability
  • added beta cluster (bartender) documentation
  • pod: scheme added
  • champ: scheme added
  • multi-server clusters enabled
  • -conf baratine.cf enabled
  • /tmp/baratine is new root-dir default
  • Kraken backups managed by virtual hashing nodes. Triad still exists as a fallback.
  • Added beta channel documentation
  • ChannelScoped support for websocket (JAMP and HAMP)
  • channel: authenticator and login-required='false' regression
  • Changed module import/export to use a consistent module:
  • Added --immediate to shutdown CLI
  • Added @OnStart to work with @OnActive during replay.
  • Added beta journal documentation

Baratine 0.8.0 - 2013-11-04

Initial Release

Baratine 0.8.0 is the first public release of Baratine. We consider 0.8 an early beta. The architecture and APIs will likely change before the stable 1.0 release.

Starting:

$ bin/baratine start -d . examples/hello.jar

$ java -jar examples/hello.jar
hello (RPC): Hello[]
hello (completion): Hello[]

$ curl http://localhost:8085/jamp/hello-service/hello
["Hello[]"]

$ bin/baratine shutdown

Command Line

The command line is currently restricted to four actions: start, shutdown, store-save, and store-load.

  • start - starts Baratine and deploys service jars named on the command line.
  • shutdown - stops Baratine.
  • store-save - saves the named Baratine store.
  • store-load - restores the named Baratine store.

Baratine Store archive format

The Baratine store archive format has changed to use Hessian serialization.

Note: the archive format and the store architecture is a work in progress. We are exploring the option of allowing a more general column-based interface to the underlying store, in addition to the current key/value serialized object store. If we do end up making that change, the archive will need to include table schema information and the values will need to represent the general column types.

The current CLI for the store archiving requires a store name that matches the store: name:

$ bin/baratine store-save -o backup.dmp my-name
$ bin/baratine store-load -i backup.dmp

@AfterBatch

The message batching annotations have now changed to be @AfterBatch and @BeforeBatch to more clearly explain their purpose.

Message batching is a feature available to any Baratine service, and can be used to batch I/O like TCP sockets or log files, providing an automatic performance boost under heavy load. When a batch of messages arrives, Baratine will deliver them normally to the processing thread, and only call the @AfterBatch once the messages have completed.

@AfterBatch is typically used for flush() operations. @BeforeBatch is used with @AfterBatch when the service needs a context, like a ThreadLocal variable, set up before the batch’s messages are delivered and torn down afterward.

Example: a logging server batching flushes to a log file:

@Service("/log")
class MyLogService {
  private PrintWriter _out;

  public void log(String msg)
  {
    _out.println(msg);
  }

  @AfterBatch
  public void afterBatch()
  {
    _out.flush();
  }
}

When the load to the log service is heavy, writes are buffered before flushing. When the load is light, the log writes are written immediately. The same batching system can be used in other contexts like database writes.

Lifecycle annotation changes

The lifecycle annotations have been renamed to more clearly express their purpose.

  1. @OnRestore - first callback method to restore from the last checkpoint.
  2. replay methods - replay calls from a @Journal
  3. @OnActive - callback when the replay is completed and service active.
  4. @OnCheckpoint - called periodically by the journal to request a checkpoint.
  5. @OnShutdown - called when the service shuts down.
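
Together on a journaled service, the callbacks might be arranged as in the following sketch. The no-argument signatures are assumptions; the 0.8.1 notes above show @OnRestore also accepting a Result<Boolean>.

@Journal
@Service("/my-service")
public class MyJournaledService {
  @OnRestore
  public void onRestore()
  {
    // restore state from the last checkpoint
  }

  @OnActive
  public void onActive()
  {
    // journal replay is complete; the service is live
  }

  @OnCheckpoint
  public void onCheckpoint()
  {
    // save state at the journal's checkpoint request
  }

  @OnShutdown
  public void onShutdown()
  {
    // graceful shutdown
  }
}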

Kelp (Baratine database engine)

Kelp is the database engine that underlies Baratine’s crash-recovery store. It is a row-oriented btree store organized around virtual pages and persisted as an append-only store organized as virtual segments of about 2M. The columns resemble a standard relational database, and include blobs used for all variable-sized data, including the serialized objects for the Baratine store. Kelp includes its own journal for database puts, using the same underlying mechanism as Baratine @Journal.

Kelp is implemented as three Baratine services: the main page/btree service, the segment writing service, and the segment garbage-collection service. Since Kelp is designed around Baratine services, it’s a non-locking database. Read-only queries are allowed to jump the queue.

Segment GC refactor

0.8.0 includes Kelp cleanup based on stress tests, primarily around the segment GC and page memory GC. Synchronization between the segment GC service and the page service is managed by segment sequence numbers and by the Baratine service queues. In 0.8.0, the calls between the GC service and the page and segment services were made more strict.

Segment mmap/fsync refactor

Kelp segments are fsync’d before they’re marked as valid for a checkpoint, and before GC segments are allowed to reclaim old segments as free. The mmapped database file is now split into mmap segments, each with its own fsync Baratine service. This change is intended as an experiment to see if it’s possible to improve parallelism and minimize fsync overhead.

Btree index repair

Because Kelp updates are journalled, page updates can be saved to disk lazily until a checkpoint is requested or the page memory GC requires free pages, reducing disk traffic. On btree leaf splits, the lazy saving means that split leaves can be written before the updated tree. This is normal behavior; no data is lost, but the btree index may be inefficient. On restart, Kelp now validates the leaf pages and rebuilds the tree pages if necessary, before the journal is replayed. This validation/repair should have the side effect of making Kelp resilient in case of database corruption.

Kelp Stress Tests

As a baseline of Kelp performance, we stress-tested reads and writes. The read stress used 1M random keys, with both a 10-byte value and a 1024-byte value. The write stress used 100k random keys. Because the keys were chosen from a uniform distribution, the test stresses both the page memory GC and the segment GC.

Test schema:

CREATE test (
  key INTEGER PRIMARY KEY,
  value BLOB
)

Type   10b        1024b
Read   633,000/s  80,000/s
Write  143,000/s  31,000/s

JAMP RPC

The jamp-rpc syntax has changed to match the jamp long-polling syntax.

JAMP-RPC is a basic HTTP interface to @Remote Baratine services. A JAMP-RPC request now consists of a list of JAMP messages, with a list of responses for any of the queries. Previously, JAMP-RPC allowed only a single message, which was incompatible with the long-polling syntax.

A sample JAMP RPC message now looks like:

POST /jamp HTTP/1.0
Content-Type: x-application/jamp-rpc
Content-Length: nnn

[["query", "/from-address", 13, "/to-address", "method", "arg1"]]

And a response looks like:

HTTP/1.0 200 ok
Content-Type: x-application/jamp-rpc

[["reply", "/from-address", 13, "result-value"]]

Multiple messages are now allowed as a JSON array of JAMP messages.