How to Do a Tree Activation on AEM as a Cloud Service
Sometimes it’s necessary to publish a whole tree of documents or assets, such as when a whole series of documents is ready to go live, or when you’ve just done an asset migration to AEM as a Cloud Service. The way these bulk activations are done has changed over the last few years, and you can no longer do a tree replication the old way on AEM as a Cloud Service. Here’s what’s changed:
Tree Replication on Previous Versions of AEM
In previous versions of AEM (i.e. AEM 6.5 and earlier), when you needed to do a bulk activation (or “bulk publishing”), you would use the “Tree Replication” or “Tree Activation” functionality in AEM. This would add every page or asset that you wanted to replicate to AEM’s replication queue.
At this point, you’ve initiated one of the most time-honored and classic ways of setting AEM on fire, as multiple things can, and often do, happen:
- The replication queue can be loaded up with hundreds, thousands, sometimes HUNDREDS OF THOUSANDS of assets and pages, depending on how large the content tree was that you told AEM to activate, and how many sub-assets or child pages that had to be pulled in.
- The AEM Author can then get massively bogged down attempting to pull all of that replication traffic into temporary storage to push out to AEM Publishers.
- Any run-of-the-mill page activations that other authors might be doing at the time need to wait behind this massive queue of replication traffic, which might take hours (or even the entire weekend, as I’ve seen in some cases).
- If there’s ANY trouble at all with any of the publishers (one of your many publishers having high CPU, a network or storage blip while one of the items was being replicated, an ill omen seen in the sky by your staff shaman, ANYTHING), the whole replication queue will hang until the issue is handled.
- If there are any permission issues with any of the assets or pages being replicated (e.g. the user publishing them doesn’t have access to the full tree for some reason, or maybe something was done manually to one of the publishers that you didn’t account for), it will hang up the whole queue.
- Or a tree-replication event can cause so many cache invalidation events that the incoming load can then crash publishers while they’re trying to repeatedly activate assets.
Anyone who has run an AEM site for any length of time has very likely run into replication issues and site reliability issues associated with tree replication.
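To make the "one bad item hangs everything" failure mode concrete, here is a minimal sketch of a strictly ordered queue. It is only an illustration of head-of-line blocking, not AEM's actual replication agent code; the item names and the `publisher_ok` callback are hypothetical.

```python
from collections import deque


def drain(queue, publisher_ok):
    """Deliver items in FIFO order, stopping at the first item that fails.

    Hypothetical model: the failing head item stays in the queue to be
    retried, so everything queued behind it has to wait.
    """
    delivered = []
    while queue:
        item = queue[0]
        if not publisher_ok(item):
            break  # the head of the queue is stuck; nothing behind it moves
        delivered.append(queue.popleft())
    return delivered


# Three bulk items from a tree activation, then an author's routine page fix.
queue = deque(["bulk-1", "bulk-2", "bulk-3", "homepage-fix"])
delivered = drain(queue, lambda item: item != "bulk-2")
print(delivered)    # only the items ahead of the failure went out
print(list(queue))  # the routine fix is still waiting behind the stuck item
```

In this toy model the author's "homepage-fix" never goes out until "bulk-2" succeeds, which is the situation described in the list above.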
Tree Replication on AEM as a Cloud Service
When AEM as a Cloud Service first launched, tree replication was still available (with limits) and worked much as it did on AEM 6.5. The Tree Replication UI was then removed at the end of 2021.
As a first note, be aware that “replication” as it’s known in AEM 6.5 and before is no longer how content gets from Author to Publish. This is now done with Sling Distribution. An AdaptTo video covers how that works in more detail.
The new mechanism for tree replication is the “Publish Content Tree” workflow.
What this workflow essentially does is:
- Starting the workflow prompts you for a payload, which is the root path of the tree you’re looking to activate (i.e. the level to start from)
- AEM will then kick off the workflow and will batch the tree replication into packages that it then distributes to all publishers using Sling Distribution. It will recursively go through the whole payload path until complete.
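As a rough mental model of the two bullets above, the sketch below walks a tree recursively from a payload path and groups the resulting paths into fixed-size batches. This is an assumption-laden illustration, not the workflow step's real implementation: the nested-dict tree, the example path, and the batch size of 2 are all hypothetical, and the actual batching rules are internal to AEM.

```python
from typing import Dict, Iterable, Iterator, List


def walk_tree(children: Dict[str, dict], root: str) -> Iterator[str]:
    """Yield the root path and every descendant path, depth-first."""
    yield root
    for name, grandchildren in children.items():
        yield from walk_tree(grandchildren, f"{root}/{name}")


def batch_paths(paths: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group paths into lists of at most batch_size entries."""
    batch: List[str] = []
    for path in paths:
        batch.append(path)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


# Hypothetical content tree below the payload path.
tree = {"a.jpg": {}, "screenshots": {"b.jpg": {}, "c.jpg": {}}}
for package in batch_paths(walk_tree(tree, "/content/dam/example"), batch_size=2):
    print(package)
```

Each printed list stands in for one distribution package. The point of batching is that no single queue entry has to carry the whole tree.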
Here’s what to do to put it to use: let’s say you’ve just done a migration of assets from your on-prem AEM 6.5 Assets instance to AEM as a Cloud Service. Let’s say you have the path of “/content/dam/arborydigital/images” which has a series of subtrees containing thousands of images that you want to activate. To activate/publish these:
- Go into your AEM Cloud Service Author instance
- Go to Tools -> Workflow -> Models
- Click “CREATE” and then “Create Model” in the top-right
- Give it a name
- Click on the workflow model and hit “EDIT”
- Search for “Process Step” and drag that in as a step in your workflow
- Click on the process step and hit the wrench icon to configure that step. Click on the “PROCESS” tab on the dialog that comes up.
- Select the “Publish Content Tree” process, and check the “Handler Advance” checkbox.
- Add “enableVersion=true,agentId=publish,includeChildren=true” in the arguments. By default, the Publish Content Tree workflow does NOT include children. Adding this argument makes the workflow iterate through the tree and publish all child nodes.
Additionally, if you want to only target Preview with this workflow instead of your Publish tier, you can put “agentId=preview” as the argument instead. (Shout-out to Eric Van Geem for pointing this out!)
- Hit the “SYNC” button in the top right corner
- Select your new workflow in the Workflow Models list, and click “START WORKFLOW”. In the dialog that pops up, in the “Payload” you can either type or browse to the tree that you want to activate. For example, putting in “/content/dam/arborydigital/images” will then activate ALL nodes under that directory. Hit “RUN”.
- The workflow will then kick off, and will take anywhere from a few minutes to a few hours to complete, depending on how many assets you just told it to activate.
- Monitoring: you can then monitor status of your replication event by looking at the `aemerror` log either in Splunk or by tailing the AEM error logs with Adobe IO from the command line.
You’ll see events like this, which indicate that the tree activation packages are rolling out to publish:
20.08.2024 13:53:24.538 [cm-p107857-e1299068-aem-author-6d6b4bddf6-bnhwx] *INFO* [EventAdminAsyncThread #7] org.apache.sling.distribution.journal.impl.publisher.DistributionPublisher [publish] Successfully applied package with id dstrpck-1724161011143-d6d35a6a-d557-4f99-9906-e8eb55e7772c, type ADD, paths [/content/dam/arborydigital/images/smoothing-spline-jmp.jpg, /content/dam/arborydigital/images/screenshots/gallery/blorp.jpg…
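If you want to keep a running tally while tailing the log, a small parser can pull the package id, type and paths out of lines like the one above. This is a sketch only: the regular expression is modeled on that single sample line, so treat the pattern and the `parse_applied_package` helper as assumptions rather than a documented log format.

```python
import re
from typing import Optional

# Pattern modeled on the one sample log line above (an assumption: other
# log layouts or message variants are not covered).
APPLIED = re.compile(
    r"\[(?P<agent>[^\]]+)\] Successfully applied package with id "
    r"(?P<package_id>[^,\s]+), type (?P<type>\w+), paths \[(?P<paths>[^\]]*)"
)


def parse_applied_package(line: str) -> Optional[dict]:
    """Return agent, package id, type and paths, or None if the line does not match."""
    match = APPLIED.search(line)
    if match is None:
        return None
    # Long path lists can be cut off with an ellipsis; drop that marker.
    # When a list is cut off, the last path may be incomplete.
    raw_paths = match.group("paths").rstrip("… ")
    return {
        "agent": match.group("agent"),
        "package_id": match.group("package_id"),
        "type": match.group("type"),
        "paths": [p.strip() for p in raw_paths.split(",") if p.strip()],
    }
```

Feeding each tailed line through `parse_applied_package` and skipping the `None` results gives you a simple way to count how many packages and paths have gone out so far.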
Limits: I’ve not run into any limits yet on “how big is too big” for activating assets. While writing this blog post I kicked off a tree replication workflow on AEM Cloud Service for a content tree containing approximately 100GB of assets, and the tree finished before I finished writing the blog post.
Hopefully this works well for you!
About the Author
Tad Reeves
Principal Architect at Arbory Digital
Tad has been working with Adobe products since 2010 and has extensive experience in website infrastructure. Starting in 1996, he has worn nearly every hat in website delivery from solution architecture to product management, and has over two decades of experience. He loves that Arbory gives him the opportunity to provide honest and effective solutions, even if it means challenging prevailing sales perspectives. When Tad isn’t working, he enjoys mountain biking and exploring nature with his wife & 3 kids.