Feed operators reference

This article describes the built-in activities for manipulating feeds.

The following sections describe the activities that are built into Assemble flow. Note the following points:

  • The symbols in the operator description can be found in the Flow language behavior and syntax article.
  • All the feed operators return the feed DOM node in Atom 1.0 format.
  • The Reply activity supports RSS 2.0 as an output format. See the information about using Really Simple Syndication (RSS) for details.

Feed

The Feed activity retrieves a feed from the URL given by the url attribute. It accepts the RSS 0.9x, RSS 1.0, RSS 2.0, Atom 0.3, and Atom 1.0 formats. If the URL is not valid, or if the retrieval times out, an empty feed is returned. The activity has no input and returns an XML node of an Atom 1.0 feed.

<feed name="NCName" url="url" outputVariable="NCName"? >
	control*
</feed>
Attributes
Name | Description | Required
url | An HTTP URL or relative URL from which the feed is retrieved. | Yes
outputVariable | Output variable | No
  • The Feed activity supports the RSS 0.9x/1.0/2.0 and Atom 0.3/1.0 feed formats. It converts these formats into the Atom 1.0 format to simplify further feed manipulation. All of the other feed operators accept Atom 1.0 feeds only.
  • You can also get a feed using the built-in GET activity, but the GET activity does not convert the feed result into the Atom 1.0 format, so its output might not be acceptable for the other feed operators.

The following example retrieves a Yahoo!® top story feed.

<process name="feed">    
    <feed url="http://rss.news.yahoo.com/rss/topstories" name="feed_0"/>
    <replyGET name="End">
        <input value="${feed_0}"/>
    </replyGET>
</process>

The following image shows the Feed activity in the Assemble tooling:

Sample shown in the tooling

SortFeed

The SortFeed activity sorts the feed entries by any entry element, such as the title or the updated element of the entry. The sort order can be ascending or descending. The activity has one input of the XML Node type and returns an XML Node of an Atom 1.0 feed.

<sortFeed name="NCName" orderBy="CONDITION" outputVariable="NCName"? >
    input
    control*
</sortFeed>
Attributes
Name | Description | Required
orderBy | The sort rules. Use a semicolon (;) to separate multiple rules. | Yes
outputVariable | Output variable | No

The following example sorts feed entries in ascending order by the title value.

<process name="sort">
    <feed url="http://rss.news.yahoo.com/rss/topstories" name="feed_0"/>
    <sortFeed orderBy="+title" name="sortFeed_0">
        <input value="${feed_0}"/>
    </sortFeed>
    <replyGET name="End">
        <input value="${sortFeed_0}"/>
    </replyGET>
</process>

The following image shows the SortFeed activity in the Assemble tooling:

The SortFeed activity
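
To make the orderBy rule format more concrete, the following Java code is a minimal sketch and not the flow engine's implementation. It assumes that each rule is an element name with an optional sign prefix, that + means ascending as in the +title example above, and that - means descending, which this article does not confirm. Entries are modelled as simple maps, and the OrderBySketch class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns an orderBy string such as "+title;-updated" into a Comparator.
class OrderBySketch {

    static Comparator<Map<String, String>> parse(String orderBy) {
        // With no rules at all, every pair compares equal and input order is kept.
        Comparator<Map<String, String>> result = (a, b) -> 0;
        for (String rule : orderBy.split(";")) {
            String trimmed = rule.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Assumption: '+' (or no sign) sorts ascending, '-' sorts descending.
            boolean descending = trimmed.startsWith("-");
            boolean signed = descending || trimmed.startsWith("+");
            String field = signed ? trimmed.substring(1) : trimmed;
            Comparator<Map<String, String>> byField =
                    Comparator.comparing((Map<String, String> e) -> e.getOrDefault(field, ""));
            if (descending) {
                byField = byField.reversed();
            }
            // Later rules only break ties left by earlier rules.
            result = result.thenComparing(byField);
        }
        return result;
    }

    static List<Map<String, String>> sort(List<Map<String, String>> entries, String orderBy) {
        List<Map<String, String>> copy = new ArrayList<>(entries);
        copy.sort(parse(orderBy)); // List.sort is stable, so ties keep input order
        return copy;
    }
}
```

Under these assumptions, OrderBySketch.sort(entries, "+title") orders the entries by title from A to Z and leaves the input list unchanged.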

AggregateFeeds

The AggregateFeeds activity merges two or more feeds into a new feed. It can accept two or more inputs and returns an Atom 1.0 feed Node.

<aggregateFeeds name="NCName" outputVariable="NCName"?
		title="new-feed-title" description="feed-description" link="new-feed-link"? >
    input+
    control*
</aggregateFeeds>
  • The input value can be a feed returned from the <feed> activity, a list of feeds, a URL, or a list of URLs.

Attributes
Name | Description | Required
title | Title of the new feed. The default title can be changed in zero.config at /config/feed/operators/aggregate#title. | Yes
description | Description of the new feed. The default description can be changed in zero.config at /config/feed/operators/aggregate#description. | Yes
link | Link of the new feed. | No
outputVariable | Output variable | No

The following examples aggregate top story feeds and sports feeds from Yahoo! into a new feed.

  • Aggregate the feeds
    <process name="aggregate">    
        <feed url="http://sports.yahoo.com/top/rss.xml" name="feed_0"/>
        <feed url="http://rss.news.yahoo.com/rss/topstories" name="feed_1"/>
        <aggregateFeeds title="Feeds" name="aggregateFeeds_0">
            <input value="${feed_0}"/>
            <input value="${feed_1}"/>
        </aggregateFeeds>
        <replyGET name="End">
            <input value="${aggregateFeeds_0}"/>
        </replyGET>
    </process>
    
  • Aggregate the feeds by URLs
    <process name="aggregate">    
        <aggregateFeeds title="Feeds" name="aggregateFeeds_0">
            <input value="http://sports.yahoo.com/top/rss.xml"/>
            <input value="http://rss.news.yahoo.com/rss/topstories"/>
        </aggregateFeeds>
        <replyGET name="End">
            <input value="${aggregateFeeds_0}"/>
        </replyGET>
    </process>
    
  • Aggregate the feeds by URL list
    <process name="aggregate">
        <variable name="urls" 
            value="${['http://sports.yahoo.com/top/rss.xml', 'http://rss.news.yahoo.com/rss/topstories']}" />   
        <aggregateFeeds title="Feeds" name="aggregateFeeds_0">
            <input value="${urls}"/>
        </aggregateFeeds>
        <replyGET name="End">
            <input value="${aggregateFeeds_0}"/>
        </replyGET>
    </process>
    

The following image shows the aggregateFeeds activity in the Assemble tooling:

The aggregateFeeds activity

FilterFeed

The FilterFeed activity selects a subset of the entries of the input feed. For simple filtering, specify keywords; the flow engine matches them against the title, content, and summary of each entry and returns the entries that contain any of the keywords. For advanced filtering, provide a condition that is a valid Boolean XPath 1.0 expression. You can also supply both; to combine them with AND semantics, build the keyword test into your condition. The activity has one input of the XML Node type and returns an Atom 1.0 feed Node.

<filterFeed name="NCName" keywords="StringList"? condition="XPath1.0-expr"? outputVariable="NCName"? >
       input
      control*
</filterFeed>
Attributes
Name | Description | Required
keywords | The keywords to filter by. Separate multiple keywords with a comma (,). The value is not case sensitive. | No
condition | A valid Boolean XPath 1.0 expression. | No
outputVariable | Output variable | No

Either keywords or a condition is required. If both keywords and a condition are specified, they are combined with a logical OR when filtering.

The following sample filters sports feeds from Yahoo! by the keyword tennis.

<process name="filter">
    <feed url="http://sports.yahoo.com/top/rss.xml" name="feed_0"/>
    <filterFeed keywords="tennis" name="filterFeed_0">
        <input value="${feed_0}"/>
    </filterFeed>
    <replyGET name="End">
        <input value="${filterFeed_0}"/>
    </replyGET>
</process>

The following image shows the filterFeed activity in the Assemble tooling:

The filterFeed activity
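
The keyword rule described above can be sketched in plain Java. This is an illustration under stated assumptions, not the flow engine's code: it assumes that a keyword matches when it occurs as a substring, and that whitespace around each comma-separated keyword is ignored. The KeywordFilterSketch class and its matches method are hypothetical names.

```java
import java.util.Locale;

// Hypothetical helper mirroring the keyword rule described above.
class KeywordFilterSketch {

    // Returns true when any comma-separated keyword occurs in the title, content, or summary.
    static boolean matches(String keywords, String title, String content, String summary) {
        // Join the three fields with a newline so a keyword cannot match across field boundaries.
        String haystack = String.join("\n",
                nullToEmpty(title), nullToEmpty(content), nullToEmpty(summary))
                .toLowerCase(Locale.ROOT);
        for (String keyword : keywords.split(",")) {
            // Assumption: surrounding whitespace is not part of the keyword.
            String needle = keyword.trim().toLowerCase(Locale.ROOT);
            if (!needle.isEmpty() && haystack.contains(needle)) {
                return true; // "any of the keywords" is enough
            }
        }
        return false;
    }

    private static String nullToEmpty(String s) {
        return s == null ? "" : s;
    }
}
```

With this sketch, the keyword list "tennis, golf" matches an entry whose title is "TENNIS final", because the comparison ignores case and a single matching keyword is sufficient.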

Advanced topic: Extended XPath functions

You can use an extension XPath function for filtering, as shown in the following example:

.//atom:entry[zero:containsIgnoreCase(atom:title,'web 2.0')]

The zero prefix, which is bound to the namespace URI http://www.projectzero.org/assemble/flow, is predefined in the Assemble flow namespace context. The following built-in extension functions are supported.

Extension XPath functions
Name | Description
zero:containsIgnoreCase | boolean containsIgnoreCase(String arg1, String arg2)
zero:equalsIgnoreCase | boolean equalsIgnoreCase(String arg1, String arg2)
zero:toLowerCase | String toLowerCase(String arg1)
zero:toUpperCase | String toUpperCase(String arg1)

You can add your own XPath functions by registering an XPathFunctionResolver.

For example, to invoke a custom XPath function named filter in the filterFeed activity, first write a resolver for it. The following sample shows the Java™ code of a resolver for the filter function:

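package zero.assemble.flow.util;

import javax.xml.namespace.QName;
import javax.xml.xpath.XPathFunction;
import javax.xml.xpath.XPathFunctionResolver;

// Resolves the custom XPath function named filter in the http://www.xyz.com namespace.
public class TestXPathFunctionResolver implements XPathFunctionResolver {

	// Qualified name of the custom function: namespace URI plus local name.
	private static final QName FILTER = new QName("http://www.xyz.com", "filter");

	// MyFilter is an XPathFunction implementation that is not shown in this article.
	@Override
	public XPathFunction resolveFunction(QName functionName, int arity) {
		// The flow file in this section calls ext:filter(.) with one argument.
		if (FILTER.equals(functionName) && arity == 1) {
			return new MyFilter();
		}
		// Returning null signals that this resolver does not provide the function.
		return null;
	}
}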
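
The resolver returns a MyFilter instance, but this article does not show that class. The following is a minimal sketch of what such a class could look like, assuming the one-argument call ext:filter(.) that the flow file in this section uses. The keyword test is purely an illustrative assumption, because the article does not say what the filter function checks. The package declaration is omitted to keep the sketch self-contained; in the setup described here the class would sit next to the resolver.

```java
import java.util.List;
import java.util.Locale;
import javax.xml.xpath.XPathFunction;
import javax.xml.xpath.XPathFunctionException;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical implementation of the MyFilter class named in the resolver.
class MyFilter implements XPathFunction {

    // Assumed keyword; chosen only for illustration.
    private static final String KEYWORD = "tennis";

    // The raw List parameter matches the original interface signature
    // and still compiles against newer JDKs that declare List<?>.
    @SuppressWarnings("rawtypes")
    @Override
    public Object evaluate(List args) throws XPathFunctionException {
        if (args == null || args.size() != 1) {
            throw new XPathFunctionException("filter() expects exactly one argument");
        }
        String text = textOf(args.get(0));
        // A Boolean result lets the function act as a filterFeed condition.
        return text.toLowerCase(Locale.ROOT).contains(KEYWORD);
    }

    // A node-set argument such as "." typically arrives as a NodeList;
    // single nodes and plain values are handled as well.
    private static String textOf(Object arg) {
        if (arg instanceof NodeList) {
            NodeList nodes = (NodeList) arg;
            return nodes.getLength() == 0 ? "" : nullToEmpty(nodes.item(0).getTextContent());
        }
        if (arg instanceof Node) {
            return nullToEmpty(((Node) arg).getTextContent());
        }
        return String.valueOf(arg);
    }

    private static String nullToEmpty(String s) {
        return s == null ? "" : s;
    }
}
```

With this sketch, a condition of ext:filter(.) keeps a node whose text contains the assumed keyword, regardless of case.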

Then register the resolver in zero.config, as shown in the following example:

/config/assemble/flow/XPathFunctionResolver += [
	"zero.assemble.flow.util.TestXPathFunctionResolver"
]

The following flow file demonstrates how the custom XPath function filter is used:

<process name="extFilter" expressionLanguage="Groovy" xmlns:ext="http://www.xyz.com" >  
    
	<feed name="feed" url="/feed/rss20.xml" />
		
	<filterFeed name="feedfilter" condition="ext:filter(.)">
	    <input value="${feed}"/>
	</filterFeed>
	
	<replyGET name="reply">	
	  <input value="${feedfilter}"/>
	</replyGET>

</process>

Unique

The Unique activity removes duplicate entries, which it identifies with an XPath expression. For example, if the input feed has six entries with the same title, you can use the Unique activity with the atom:title expression so that only one of these entries is included in the output feed.

The Unique activity has one input to accept the DOM node of the Atom 1.0 feed, and it returns the DOM Node of the feed without the duplicated entries.

<unique name="NCName" by="CONDITION" outputVariable="NCName"? >
     input
    control*
</unique>
Attributes
Name | Description | Required
by | The element by which entries are compared for uniqueness. The default value is atom:id. | Yes
outputVariable | Output variable | No

The following sample creates a unique feed using the value of the title of each entry.

<process name="unique" xmlns:atom="http://www.w3.org/2005/Atom">
    <feed url="http://sports.yahoo.com/top/rss.xml" name="feed_0"/>
    <unique by="atom:title" name="unique_0">
        <input value="${feed_0}"/>
    </unique>
    <replyGET name="End">
        <input value="${unique_0}"/>
    </replyGET>
</process>

The following image shows the unique activity in the Assemble tooling:

The unique activity.
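
The de-duplication rule can be sketched in a few lines of Java. This is an illustration, not the engine's implementation: the article says only that one entry per duplicate group is kept, so keeping the first entry encountered is an assumption. The UniqueSketch class and the uniqueBy method are hypothetical names, and the key function stands in for the XPath expression given in the by attribute.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper mirroring the Unique rule.
class UniqueSketch {

    // Keeps the first entry seen for each key and drops later entries with the same key.
    static <T, K> List<T> uniqueBy(List<T> entries, Function<? super T, ? extends K> key) {
        Set<K> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T entry : entries) {
            // Set.add returns false when the key was already present.
            if (seen.add(key.apply(entry))) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

If the entries are represented by their titles, a feed with the titles a, b, a, a is reduced to a, b, and the relative order of the surviving entries is preserved.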

Truncate

The Truncate activity returns a specific number of entries from the top of the input feed, which lets you limit the number of entries in the output feed. The number of entries returned is given by the number attribute. The activity has one input and returns an XML Node of an Atom 1.0 feed.

<truncate name="NCName" number="num" outputVariable="NCName"? >
        input
       control*
</truncate>
Attributes
Name | Description | Required
number | The number of entries to return. If it is greater than the total number of entries, all entries are returned. | Yes
outputVariable | Output variable | No

The following sample retrieves the first five entries from the Yahoo! sports feed.

<process name="truncate">
    <feed url="http://sports.yahoo.com/top/rss.xml" name="feed_0"/>
    <truncate number="5" name="truncate_0">
        <input value="${feed_0}"/>
    </truncate>
    <replyGET name="End">
        <input value="${truncate_0}"/>
    </replyGET>
</process>

The following image shows the truncate activity in the Assemble tooling:

The truncate activity.
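
The following Java sketch shows the truncation rule in isolation, consistent with the sample above, which returns the first five entries for number="5". It is not the engine's implementation. The TruncateSketch class is a hypothetical name, and rejecting a negative number is an assumption that this article does not cover.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper mirroring the Truncate rule.
class TruncateSketch {

    // Returns the first `number` entries, or all of them when the feed is shorter.
    static <T> List<T> truncate(List<T> entries, int number) {
        if (number < 0) {
            // Assumption: a negative count is treated as a caller error.
            throw new IllegalArgumentException("number must not be negative");
        }
        int end = Math.min(number, entries.size());
        // Copy the sublist so the result does not depend on the input list.
        return new ArrayList<>(entries.subList(0, end));
    }
}
```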

ListEntries

The ListEntries activity accesses the collection resource specified by the url attribute. If the resource URL is valid, the activity returns an Atom 1.0 feed document; otherwise, it returns an exception variable with the exception details. No input is required for this operator.

<listEntries name="NCName" url="url">
       control*
</listEntries>

Attributes
Name | Description | Required
url | An HTTP URL or relative URL of the collection resource. | Yes

CreateEntry

The CreateEntry activity creates a member resource in the collection resource specified by the url attribute. If the resource is created successfully, the activity returns the newly created entry; otherwise, it returns an exception variable. Its input must be an Atom entry, and only a single input is allowed.

<createEntry name="NCName" url="url">
       input
       control*
</createEntry>
Attributes
Name | Description | Required
url | An HTTP URL or relative URL of the collection resource. | Yes
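
Because the input must be an Atom entry, it can help to see what a minimal one contains. The following Java sketch builds an entry as a DOM document with the id, title, and updated elements that the Atom 1.0 format requires of an entry. It is only an illustration: the AtomEntrySketch class is a hypothetical name, and how such a document is handed to a flow as input is outside the scope of this sketch.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper that builds a minimal Atom 1.0 entry as a DOM document.
class AtomEntrySketch {

    // The Atom 1.0 namespace, as used by the atom prefix elsewhere in this article.
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    static Document newEntry(String id, String title, String updated)
            throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        Element entry = doc.createElementNS(ATOM_NS, "entry");
        doc.appendChild(entry);
        appendText(doc, entry, "id", id);
        appendText(doc, entry, "title", title);
        appendText(doc, entry, "updated", updated); // for example 2008-01-01T00:00:00Z
        return doc;
    }

    // Adds a child element in the Atom namespace with the given text content.
    private static void appendText(Document doc, Element parent, String name, String text) {
        Element child = doc.createElementNS(ATOM_NS, name);
        child.setTextContent(text);
        parent.appendChild(child);
    }
}
```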

RetrieveEntry

The RetrieveEntry activity retrieves the full content of the Atom entry of a member resource. You can specify the resource URL explicitly with the url attribute, or supply as input a partial member resource that contains the member URL.

<retrieveEntry name="NCName" url="url"? >
       input?
       control*
</retrieveEntry>
Attributes
Name | Description | Required
url | An HTTP URL or relative URL of the member resource. | No

UpdateEntry

The UpdateEntry activity edits the Atom entry representation of a specific member resource. Supply the updated entry as input, and this operator stores it at the entry's member resource URL. There is no output variable for this operator if the update succeeds.

<updateEntry name="NCName">
       input
       control*
</updateEntry>

DeleteEntry

The DeleteEntry activity deletes the Atom entry that you supply as input from a collection resource. There is no output variable for this operator if the entry is deleted successfully.

<deleteEntry name="NCName">
       input
       control*
</deleteEntry>

Reply feed in RSS 2.0 format

Although all the feed operators use the Atom 1.0 format for manipulation, Assemble flow can return output in RSS 2.0 format when you set the view attribute of the reply activity to RSS.

The following example aggregates top story feeds and sports feeds from Yahoo! and returns the output in an RSS 2.0 format.

<process name="aggregate">    
    <feed url="http://sports.yahoo.com/top/rss.xml" name="feed_0"/>
    <feed url="http://rss.news.yahoo.com/rss/topstories" name="feed_1"/>
    <aggregateFeeds title="Feeds" name="aggregateFeeds_0">
        <input value="${feed_0}"/>
        <input value="${feed_1}"/>
    </aggregateFeeds>
    <replyGET name="End" view="RSS">
        <input value="${aggregateFeeds_0}"/>
    </replyGET>
</process>

Version 1.1.31300