10 tips for organizing your SAS Enterprise Guide projects

This post was kindly contributed by The SAS Dummy - go there to comment and to read the full post.

What is the best way to organize your SAS work in a SAS Enterprise Guide project?  There are no project templates or enforced structure, really, but isn’t there a best practice?

I don’t have a single prescription for the best project organization.  I believe that it depends on the nature of the work you’re doing, how you’re sharing projects among team members, and on your own personal preferences and working style.  SAS tools are often deliberately flexible, which means the onus is on you to keep the chaos under control.

I can offer some guiding principles, though.  These guidelines center on one theme: bring some discipline into your projects.  Don’t let them get all (and this is a technical term) higgledy-piggledy.  Here are my top 10 tips.

1. Arrange process flows (when you have multiple flows in a project) in the order that you expect to run them

Having the process flows arranged in logical order makes it easy to see how the project will be run, and can provide an at-a-glance overview of the project organization.

Note that you don’t have to create the flows in the order that they will be run.  During the design process, you might begin with sample data to design an analysis and output reports, and then go back later to refine the data import process, which naturally would run first.  That’s okay.  You can always rearrange the flows later by dragging each process flow icon in the Project Tree window to the position where it should be.

2. Assign concise, meaningful names to each of your process flows

“Process Flow” is not a very descriptive label, so make use of the Rename feature to assign a better name.  (Right-click on the name in the Project Tree window and select Rename, or press F2 to go into “rename” mode.)

Some people like to use numbers to indicate the sequence.  For example, a project might contain process flows with names like:  “1. Import Data”, “2. Data description”, “3. Basic analysis”, and so on.  For stable projects, numbered flows can work well.  For projects that undergo frequent changes, numbering adds maintenance: each time you insert a new process flow between two others, you have to renumber the flows that follow.

3. Don’t make the flows too big – if a flow gets really long, see if it makes sense to break it up into multiple flows

If you’re a programmer, this concept should make sense.  Think of your project as a big SAS program, with the process flows serving as subroutines that organize the program for easier maintenance.

You can easily move items from one flow to another.  Right-click the item that you want to move and select Move to.  To break up an existing process flow, first select File->New->Process Flow to create an empty flow.  Then rename the new flow as appropriate, and begin moving items from the too-complex flow into it.

4. Use Note objects, and link them to the nodes they describe

You can never have too much documentation.  If your project contains SAS programs, you would use SAS code comments to help make your program more readable and maintainable (unless, of course, you are deliberately working to avoid that goal).  To document a process flow, add a note with File->New->Note.  The note object allows you to add plain text descriptions to your flow.  You should also rename the note item to provide a meaningful label within the flow.

To make it obvious which items the note describes, link the note to other items in the flow. (Use right-click->Link To, or “draw” the link by clicking near the border of the note icon, when the cursor appears as cross-hairs, and drag the link arrow to the target item.)

For richer documentation, you can add external documents to your process flow, such as PDF or Microsoft Word files.  To add these, select File->Open->Other and browse to the document file to add.  Note that this does not embed the document file within your project; it merely adds a reference, making the document easy to access.  This means that the document must be present in the referenced location or else you won’t be able to open it within the project.

5. Rename tasks, queries, programs using descriptive (but concise) names

If a process flow is like a sentence (as in “language”, not as in “prison”), then a task is like a verb, while a result is like a noun.  But the default names that SAS Enterprise Guide assigns for tasks don’t always describe the actions well.  For example, you might use the query builder to calculate the sum of a variable across categories in a data set.  But the default label on the task might be something like “Query for DC.PROJECTS”.  Well, that could be anything.

Rename the query builder node to reflect the action, such as “Count Projects per State”.  You can still see that it’s a query task from the icon that’s used within the flow.  And if you don’t recognize the icon on sight, you can hover the mouse cursor over the icon to see a tooltip that reminds you what type of task is used for the action.

To rename a node, select the node in the flow and press F2 (to go into “rename” mode), or right-click and select Rename.

6. Rename default output data sets from tasks using shorter, meaningful names

Queries and tasks that create output data sets will use auto-generated names, by default.  The names are often cumbersome and generic, built by combining the task name with the input data name, such as WORK.QUERY_FOR_MART_VENDORSPEND.  Use the options within the task to select a different name that is shorter and more descriptive.

These names aren’t just display labels; they are the names of the physical output data sets that are created when the task runs.  By using shorter names, you can make it easier to find the output data sets in the file dialog later, or to refer to them within SAS programs.

Note: You cannot rename the output data sets within the process flow.  To rename an output data set, you must modify the task that created it, and then re-run the task to refresh the name.  The output data set names are usually controlled in the “results” options for the particular task.
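To see why shorter names help when you refer to task output in a SAS program, compare the two steps in this sketch.  The first uses the auto-generated example name from above; WORK.VENDOR_TOTALS is a hypothetical name chosen only for illustration.

```sas
/* Referring to the default, auto-generated output name */
proc print data=WORK.QUERY_FOR_MART_VENDORSPEND(obs=10);
run;

/* The same step after choosing a shorter name in the task options. */
/* WORK.VENDOR_TOTALS is a hypothetical name (an assumption).       */
proc print data=WORK.VENDOR_TOTALS(obs=10);
run;
```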

7. Turn on Auto Arrange for a good first pass at layout, then turn it off for manual refinements

For small flows, the Auto Arrange feature of the process flow will present a nice, readable layout.  But as you add multiple “branches” to your flow, you might notice that the process flow diagram grows vertically with lots of white space in between branches.

You can adjust this by turning off Auto Arrange.  Right-click in the process flow and select Auto Arrange, “unchecking” the option from on to off.  Then you can select any of the items within the flow and drag them to where you want them to be on the canvas.

If things get crazy, turn Auto Arrange back on to make your nodes all “snap to”.  But take note: the Auto Arrange setting affects all of the flows within your project, so if you toggle the setting to tweak one flow, you may find that your other flows are also “rearranged” automatically.

8. Change the background color of process flows to make them distinct

Consider using this feature to make it easy to see which flow you’re on.  To change the color of a process flow, right-click on an empty spot on the flow canvas and select Background color.

9. Configure the first process flow to run automatically (“autoexec”)

You can use the Autoexec process flow to enforce initialization of certain libraries, macros, and more.  Read more about the Autoexec process flow here.
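As a rough illustration, an initialization program in the Autoexec flow might look something like the sketch below.  The library name, path, macro variable, and macro are all hypothetical placeholders, so substitute whatever your own project needs.

```sas
/* Hypothetical initialization program for an Autoexec process flow. */
/* The libref, path, and names below are placeholders (assumptions). */

/* Assign a library that later tasks and queries can refer to */
libname mart "/data/projects/mart" access=readonly;

/* Define a project-wide macro variable */
%let report_year = 2012;

/* Define a small helper macro for consistent report titles */
%macro stamp(title_text);
  title "&title_text (run on &sysdate9)";
%mend stamp;
```

Keeping this kind of setup in one place means that the other flows can assume the library and macros already exist when they run.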

10. Remember: a project is like a recipe.

It tells you (and SAS Enterprise Guide and SAS) what the ingredients are and how to combine them, but it doesn’t always contain the ingredients themselves.  The project refers to external pieces, such as data sets, .SAS files (programs) and data files to import.

Keeping your “recipe” clean and organized will increase the chances that you can successfully “cook” with it repeatedly, and that colleagues can use it to repeat your results.
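One way to confirm that the “ingredients” are in place before running a project is a small check program, sketched here.  The file path and data set name are hypothetical placeholders, not references from a real project.

```sas
/* Hypothetical pre-flight check; the path and data set name are placeholders. */
data _null_;
  /* FILEEXIST returns 1 when the external file is present, 0 otherwise */
  if not fileexist("/data/projects/import/vendors.csv") then
    put "WARNING: import file not found.";
  /* EXIST returns 1 when the SAS data set is present, 0 otherwise */
  if not exist("mart.vendorspend") then
    put "WARNING: data set MART.VENDORSPEND not found.";
run;
```

Any missing piece shows up as a warning in the SAS log, which is easier to diagnose than a task that fails partway through a flow.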

Other resources

Here are a few other resources that can help you to learn how to leverage the “project” aspect of SAS Enterprise Guide:

