Pivotal Greenplum Workload Manager Documentation | Pivotal

Table of Contents
Pivotal Greenplum Workload Manager Documentation
About Greenplum Workload Manager
Installing Greenplum Workload Manager
Managing Greenplum Workload Manager Services
Using the Greenplum Workload Manager Command Line
Using the Workload Manager Graphical Interface (gptop)
Using Workload Manager Rules
Understanding Rules
Adding Rules
Managing Rules
Example Rules
Best Practices for Rules
Caveats
Querying Workload Manager Record Data
Querying Workload Manager Event Data
Configuring Workload Manager Components
Troubleshooting
Workload Manager Metric Reference
© Copyright Pivotal Software Inc, 2013-2017
1.8.2
Pivotal Greenplum Workload Manager Documentation
Documentation for Pivotal Greenplum Workload Manager.
About Greenplum Workload Manager
Installing Greenplum Workload Manager
Managing Greenplum Workload Manager Services
Using the Greenplum Workload Manager Command Line
Using the Workload Manager Graphical Interface (gptop)
Using Workload Manager Rules
Querying Workload Manager Record Data
Querying Workload Manager Event Data
Configuring Greenplum Workload Manager Components
Troubleshooting
Workload Manager Metric Reference
About Greenplum Workload Manager
Greenplum Workload Manager is a management tool for Greenplum Database you can use to monitor and manage queries.
You can use Greenplum Workload Manager to perform tasks like these:
  Monitor Greenplum Database queries and host utilization statistics
  Log when a query exceeds a threshold
  Throttle the CPU usage of a query when it exceeds a threshold
  Terminate a query
  Detect memory, CPU, or disk I/O skew occurring during the execution of a query
  Create detailed rules to manage queries
Workload Manager Architecture
Greenplum Workload Manager is a set of Greenplum Database-specific plugins deployed on an extensible Pivotal framework. All of the application logic is isolated in these plugins. Workload Manager provides the following plugins:
Agent plugins:
  Publish information about active Greenplum Database queries
  Publish information about postgres processes
  Advertise query termination capability
  Advertise query throttling capability
  Advertise threshold logging capability
Configuration management plugins:
  Query the state of the Greenplum Database cluster periodically
  Inform the framework of the Greenplum Database cluster state and size, allowing gp-wlm to automatically grow when the database is expanded.
  Deploy configurations throughout the cluster.
Command-line interface plugins:
  Add, modify, or delete rules
  Monitor queries and skew
Rules engine plugins:
  Provide extended functionality used during rules creation
The runtime framework loads these plugins at execution time.
Installing Greenplum Workload Manager
Prerequisites
  Red Hat Enterprise Linux (RHEL) 64-bit 5.5+ or 6, CentOS 64-bit 5.5+ or 6, or SUSE Linux Enterprise 11 SP4, 64-bit
  Greenplum Database version 4.3.13.x or higher
  Pivotal Greenplum Command Center installer for your platform

The Greenplum Workload Manager installers are included in the Pivotal Greenplum Command Center installer you download from Pivotal Network. The installer file, gp-wlm-<version>-<platform>.bin, is in the Greenplum Command Center installation directory, /usr/local/greenplum-cc-web, by default.
Running the Greenplum Workload Manager Installer
Greenplum Workload Manager is installed on the Greenplum Database master node. It automatically distributes the software to all segment servers in the database cluster. The installer detects the installed Workload Manager version, if any, and performs an upgrade if necessary. Run the installer with the --force option to force reinstallation of the current version.
The package installer has the following syntax:
./gp-wlm-<version>-<platform>.bin --help
./gp-wlm-<version>-<platform>.bin --install=<DIR> [--force] [--install-concurrency=<COUNT>]
    [--no-remove-old] [--skip-health-check] [--dbname=<database_name>]
    [--tool-manifest=<FILE>]
Options
--help
Displays command syntax for the Workload Manager installer.
--install=DIR
The --install option is required. It specifies the directory where Greenplum Workload Manager will be installed, for example /home/gpadmin.
--force
If the --install option points to an existing Greenplum Workload Manager install, the installer will check the currently installed version and perform an upgrade only if the current version is older than the version being installed. If the --force option is specified, the installer will allow installing the same version of Greenplum Workload Manager on top of itself. Note that --force does not allow you to downgrade Greenplum Workload Manager to an earlier version.
--install-concurrency=COUNT
The maximum number of hosts to bootstrap at once. The default count is computed by the installer. This option places a limit on the number of processes the installer can fork.
--no-remove-old
By default, the installer removes all previous installation directories after an upgrade. The --no-remove-old option prevents the installer from removing old installation directories.
--skip-health-check
Do not perform a cluster health check after Workload Manager installation completes. This option is not recommended.
--dbname
The name of the database where the gp_wlm_records table and the gp_wlm_events view are created. The default is postgres. The template0 and template1 databases may not be specified. The database must exist at install time. The same database must be specified when upgrading to a new Workload Manager release.
--tool-manifest filename
The optional --tool-manifest option specifies a text file containing a list of commands and their absolute paths. Workload Manager normally finds standard system commands on the path. If your environment has incompatible implementations of these commands on the path, create a manifest file that provides the absolute path to a standard version.
Following is an example tools manifest file:
stat=/home/gpadmin/bin/stat
readlink=/bin/readlink
ssh=/home/me/bin/myssh
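The manifest format shown above, one command=path pair per line, can be sanity-checked before it is passed to the installer. The following Python sketch is illustrative only: the function name parse_tool_manifest is hypothetical, and the grammar the installer actually accepts (blank lines, comments, surrounding whitespace) is not documented here, so the parsing rules are assumptions.

```python
from pathlib import PurePosixPath


def parse_tool_manifest(text):
    """Parse 'command=/absolute/path' lines into a dict.

    Assumption: one pair per line; blank lines are skipped.
    Relative paths are rejected because the manifest is meant
    to hold absolute paths.
    """
    tools = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        name, sep, path = line.partition("=")
        if not sep or not name or not PurePosixPath(path).is_absolute():
            raise ValueError(f"bad manifest line: {raw!r}")
        tools[name] = path
    return tools


manifest = """stat=/home/gpadmin/bin/stat
readlink=/bin/readlink
ssh=/home/me/bin/myssh
"""
tools = parse_tool_manifest(manifest)
print(tools["ssh"])
```

A check like this only catches formatting mistakes; it does not verify that the listed binaries exist on every host.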
The installer creates a gp-wlm-data directory in the installation directory and installs the Greenplum Workload Manager release into it. A symbolic link gp-wlm in the installation directory links to the specific Greenplum Workload Manager release directory.
1. Log in to the Greenplum master host as the gpadmin user.
2. Ensure that the Greenplum Workload Manager installer is executable.
$ chmod +x gp-wlm-<version>-<platform>.bin
3. Run the Greenplum Workload Manager installer. Specify the absolute path to an installation directory where you have write permission. For example:
$ ./gp-wlm-<version>-<platform>.bin --install=/home/gpadmin/
This command installs Greenplum Workload Manager in the gp-wlm-data subdirectory on all of the segments and creates the gp-wlm symbolic link. For example, the above command installs Workload Manager in /home/gpadmin/gp-wlm-data/gp-wlm-release and creates the symbolic link /home/gpadmin/gp-wlm.
Note: In rare cases, the installer can fail during the cluster-health-check phase. If the cluster is reported not healthy, re-run the installer with the --force option.
4. To add the Workload Manager executables to your path, source <INSTALL_DIR>/gp-wlm/gp-wlm_path.sh in your shell.
$ source <INSTALL_DIR>/gp-wlm/gp-wlm_path.sh
You can add the source command to your ~/.bash_profile or ~/.bashrc script to include the Workload Manager executables in your path whenever you log in.
5. (Optional) To enable the vmem metrics, see the instructions in the Vmem section of the Workload Manager Metric Reference.
Uninstalling Greenplum Workload Manager
To uninstall Greenplum Workload Manager, run the following command:
$ <INSTALL_DIR>/gp-wlm/bin/uninstall --symlink <INSTALL_DIR>/gp-wlm
Managing Greenplum Workload Manager Services
Greenplum Workload Manager installs and runs four services on all segment hosts in the Greenplum cluster:
  agent
  cfgmon
  rabbitmq
  rulesengine
The services can be managed using the INSTALLDIR/gp-wlm/bin/svc-mgr.sh command. The command has the following syntax:
INSTALLDIR/gp-wlm/bin/svc-mgr.sh \
    --service=SVCNAME \
    --action=ACTION
SVCNAME may be agent, cfgmon, rabbitmq, rulesengine, all, or a combination of multiple services in comma-separated form. If SVCNAME specifies an individual service, only that service is modified. Specify all to manipulate all services.
The ACTION parameter affects only the local system, unless it is prefixed with cluster-, in which case it runs on all hosts in the cluster. The actions are:
  start / cluster-start – Start any of the Workload Manager services that are not running.
  stop / cluster-stop – Stop any Workload Manager services that are running.
  status / cluster-status – Determine if the services are running.
  restart / cluster-restart – Restart the Workload Manager services.
  enable / cluster-enable – Enable and start Workload Manager services.
  disable / cluster-disable – Stop and disable Workload Manager services.
If you source the INSTALLDIR/gp-wlm/gp-wlm_path.sh file in your shell, the Workload Manager scripts are in your path. Otherwise, you must provide the full path to the utility in the gp-wlm/bin directory.
When a service is stopped, it will not be restarted until the start action is invoked, or the local machine reboots, whichever comes first.
When a service is disabled, it will not be restarted until the enable action is invoked. This is persistent across reboot.
The following example checks the status of all Workload Manager services on the local host:
[gpadmin@mdw ~]$ svc-mgr.sh --service=all --action=status
RabbitMQ is running out of the current installation. (PID=22541)
agent (pid 22732) is running...
cfgmon (pid 22858) is running...
rulesengine (pid 22921) is running...
The following command restarts the agent and rulesengine Workload Manager services on all nodes in the cluster:
[gpadmin@mdw ~]$ svc-mgr.sh --service=agent,rulesengine --action=cluster-restart
Checking the Health of Greenplum Workload Manager Services
At any time, the health of Greenplum Workload Manager services can be verified across the cluster by invoking the cluster-health-check utility. This tool confirms that all services are running across the cluster, and that messages are being received from each machine in the cluster. Following is the syntax for cluster-health-check:
INSTALLDIR/gp-wlm/bin/cluster-health-check --symlink=/absolute/path/to/installation/symlink
    [--max-concurrency=N]
    [--max-cluster-checks=N]
    [--help]
Options:
-c or --max-concurrency
The max-concurrency option specifies the number of hosts to check at once. The default is a computed value based on the number of hosts in the cluster: 20 if there are fewer than 100 hosts, 50 if there are 100 to 199 hosts, and 75 if there are 200 or more hosts.
-m or --max-cluster-checks
The number of times to check for a healthy cluster. The default is 1.
-s or --symlink
The absolute path to the gp-wlm directory linked to the installed Workload Manager release. Required.
-h or --help
Display command usage information and exit.
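The default described for --max-concurrency is a step function of the cluster size. The Python sketch below restates those documented thresholds; the function name default_max_concurrency is hypothetical and this is not Workload Manager's own code.

```python
def default_max_concurrency(host_count):
    """Return the documented default number of hosts checked at once."""
    if host_count < 100:
        return 20  # fewer than 100 hosts
    if host_count < 200:
        return 50  # 100 to 199 hosts
    return 75      # 200 or more hosts


for hosts in (16, 150, 400):
    print(hosts, default_max_concurrency(hosts))
```

Passing an explicit --max-concurrency=N overrides this computed default.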
If the command reports an error communicating with one or more services, the cluster may be restarted with this command:
INSTALLDIR/gp-wlm/bin/svc-mgr.sh --action=cluster-restart --service=all
This command stops and then restarts each of the Workload Manager services on each segment host.
Using the Greenplum Workload Manager Command Line
The Greenplum Workload Manager command line utility, gp-wlm, provides access to Workload Manager capabilities. The utility may be run by entering commands interactively or by specifying equivalent actions using command-line options. The command-line options are useful for scripting, since they require no interactive user input.
To get help in interactive mode, issue the command: help
To get help for command line invocation, issue the command: gp-wlm --help
Below is the gp-wlm command syntax:
Usage: gp-wlm [-g | --gptop]
    [--rule-add=[transient] <name> <rule>]
    [--rule-delete=all|<name>] [--rule-dump=<path>] [--rule-import=<path>]
    [--rule-modify=[transient] <name> <rule>] [--rule-restore=<path>]
    [--rule-show=all|<name> [<host> <domain>]]
    [--describe=<metric>]
    [--config-show <component> <setting>] [--config-describe <component> <setting>]
    [--config-modify <component> <setting>=<value>]
    [--set-domain=<domain>] [--set-host=<host>] [--schema-path=<path>]
    [--version] [--help] [--usage]
The gp-wlm command-line options have parallel commands in the gp-wlm interactive mode. The option descriptions below link to the interactive mode commands for additional usage information and examples.
Options
-g or --gptop
Starts the gptop graphical user interface. See Using the Workload Manager Graphical Interface (gptop) for more about gptop.
--rule-add
Adds a rule to the rules engine. See Adding Rules for details about the parts of a rule and examples.
--rule-delete
Deletes a rule with a specified name or, by using the reserved name all, all current rules. See Deleting Rules for details and examples.
--rule-dump
Saves the current set of permanent rules to a named file. See Saving Rules to Disk for details and examples.
--rule-import
Adds rules saved in an external file to the current rule set. See Importing Rules for details and examples.
--rule-modify
Modifies a rule by replacing the rule expression or making a transient rule permanent. See Modifying Rules for details.
--rule-restore
Restores rules from an external file, replacing the current rules in the rules engine. See Restoring Rules for details.
--rule-show
Displays a rule by name or, by using the reserved name all, all current rules. See Displaying Rules for details and examples.
--config-show
Shows the current value of a setting for a Workload Manager component. See Configuring Workload Manager Components for details about the configuration commands.
--config-describe
Describes the purpose of a setting for a Workload Manager component and its value constraints.
--config-modify
Overrides the value of a setting for a Workload Manager component. The component is automatically restarted after a setting is updated.
--set-domain
Sets the domain, or cluster name, for the gp-wlm interactive session. It is recommended to use the default domain.
--set-host
Sets the host where the gp-wlm session runs. The default is the machine where you run gp-wlm. It is recommended to only run gp-wlm on the Greenplum master host.
--schema-path
The path to the schema files. The default path, INSTALLDIR/schema, should not be changed.
--usage
Displays usage information for the gp-wlm command.
--help
Displays usage information for the gp-wlm command.
--describe
Displays a description of a metric. For example:
$ gp-wlm --describe=datid:numbackends
--version or -v
Displays the gp-wlm version.
Using gp-wlm in Interactive Mode
1. Start gp-wlm at the command line:
$ gp-wlm
The gp-wlm command prompt displays the name of the host where gp-wlm is running and the name of the Greenplum Database cluster or domain.
Enter help at the interactive prompt for a usage message.
When using the gp-wlm command line:
  Enter each command on a single line. Commands are executed immediately.
  Enter the help command to view a list of Workload Manager commands.
  Enter describe <metric> to view a description of a metric.
While entering a command, get help with command syntax by pressing the tab key to show valid options. This is especially useful when constructing a rule. In the following partial example, user entry follows the prompt and <tab> marks a press of the tab key.
mdw/gpdb-cluster> rule <tab>
add  delete  dump  modify  restore  show
mdw/gpdb-cluster> rule add <tab>
<rule-name>  transient
mdw/gpdb-cluster> rule add transient <tab>
<rule-name>
mdw/gpdb-cluster> rule add transient myrule <tab>
gpdb_record  host:  pg_terminate_backend
mdw/gpdb-cluster> rule add transient myrule gpdb_record(<tab>
)  gpdb_segment_role  message  query_start  usename
current_query  host  pid  session_id
...
Enter the quit command at the prompt to exit the gp-wlm interactive mode.
Setting the Workload Manager Target Host and Domain
Use the set host and set domain commands to set the default host and domain for the Workload Manager session.
It is recommended to only run the gp-wlm tool on the Greenplum Database master node and to leave the host and domain at their default values.
The default host is the name of the machine where you execute gp-wlm. The host name must be resolvable in DNS. You can specify different host and cluster names on the gp-wlm command line by supplying the --set-host and --set-domain command line options.
Example:
mdw/gpdb-cluster> set host smdw
smdw/gpdb-cluster> set domain gpdbsys
smdw/gpdbsys>
Using the Workload Manager Graphical Interface (gptop)
The Workload Manager graphical interface, gptop, is a curses interface that you can use to monitor live data for the rules engine, host statistics, active Greenplum Database queries, and database skew.
You can start gptop from the command line by running gptop in a terminal. If you are already using interactive gp-wlm, enter the gptop command to enter the monitor.
Note: If you use the PuTTY ssh/telnet client for Windows to run gptop, you may experience problems with function keys and line-drawing characters with the default settings. To support function keys, in the PuTTY Configuration window, choose Connection > Data and enter xterm-color or putty in the Terminal-type string field. To enable correct line-drawing characters, choose Window > Translation and set Remote character set to Use font encoding.
When you first start gptop, the GPDB Queries pane (see below) is selected. At any time, you can press the F2 key to get a pane selection menu. Use the Tab, Left-Arrow, or Right-Arrow keys to make a selection. Press F2 to close an open menu without making a selection.
An asterisk (*) next to a column heading indicates that the rows are sorted by that column. To change the sort order, press the F3 key, then choose the number of the column you want to sort by from the pop-up menu.
Press q or choose File > Exit to leave gptop.
The gptop monitoring features are under the Monitor menu. The Monitor menu has four options:
  GPDB Queries – Shows active Greenplum Database queries
  GPDB Skew – Shows skew statistics for active queries
  Hydra – Shows statistics from the rules engine
  SysData – Shows performance statistics for each host in the cluster
GPDB Queries
Note: Queries that run in under five seconds are not reported by gptop in order to minimize load on the system and to focus on queries consuming greater resources.
The GPDB Queries monitor displays a line for each active Greenplum Database query.
SessID
The session id for the query.
Time
The number of seconds since the query began executing.
User
The name of the Greenplum Database role that submitted the query.
ClientAddr
The network address from which the query was submitted.
DatName
The database name the query is running against.
Query
The text of the query.
GPDB Skew
The GPDB Skew monitor shows calculated skew statistics for active Greenplum Database queries. Statistics are calculated on each host in the system and then sent to the master where they are summarized. You can select a host and press Enter to see statistics for the host. The calculated skew value is the cubed standard deviation across the cluster. Values closer to 0.0 indicate less skew. The GPDB Skew monitor shows the following columns for each active query:
SessID
The Greenplum Database session ID for the query.
Time
The number of seconds since the query started.
User
The Greenplum Database role that submitted the query.
CPU-Skew
A measure of CPU skew calculated as the cubed standard deviation of the total CPU for each host for the query.
MEM-Skew
A measure of memory skew calculated as the cubed standard deviation of the total resident size percent for each host for the query.
READ-Skew
A measure of disk read I/O skew calculated as the cubed standard deviation of the bytes read per second for each host for the query.
WRITE-Skew
A measure of disk write I/O skew calculated as the cubed standard deviation of the bytes written per second for each host for the query.
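The skew columns above are all described as a cubed standard deviation of per-host values. The Python sketch below illustrates that calculation. It assumes the population standard deviation, since the text does not say whether a population or sample deviation is used, and the function name skew and the sample values are hypothetical.

```python
import statistics


def skew(per_host_values):
    """Cube the standard deviation of one value per host.

    Assumption: population standard deviation (statistics.pstdev).
    """
    return statistics.pstdev(per_host_values) ** 3


balanced = [10.0, 10.0, 10.0, 10.0]  # same CPU on every host
uneven = [10.0, 10.0, 10.0, 50.0]    # one host doing most of the work

print(skew(balanced))
print(skew(uneven))
```

Identical per-host values give a skew of 0.0, and cubing makes the value grow quickly as one host diverges from the others.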
Using Workload Manager Rules
Rules trigger actions when they match events. The agent plugins on the segment hosts collect statistics and associated data. The rules engine matches them to rules, and performs the specified actions in the agent plugins.
Understanding Rules
Add Rule Command Syntax
Managing Rules
Example Rules
Best Practices for Rules
Caveats
Understanding Rules
This topic provides an introduction to Workload Manager rules, including how to write them and how they behave in a Greenplum Database cluster with Workload Manager.
Rules Overview
A Workload Manager rule specifies an action to execute when a specified condition is detected in the Greenplum Database cluster. Administrators write Workload Manager rules to investigate problem queries, throttle queries that consume too much CPU, or simply terminate queries that could disrupt the database system.
The rulesengine service on each Greenplum host evaluates rules against facts, called metrics, collected from the Greenplum host operating systems and database processes. At regular intervals, metrics are collected and submitted to the rulesengine service on each host. When the rules engine matches a rule, it performs its action.
A rule has an action expression and a condition expression separated by the WHEN keyword. It can be read as “do <action-exp> WHEN <condition-exp>”.
Here is a rule that terminates any session that has run for over 120 seconds:
pg_terminate_backend() when session_id:host:pid:runtime > 120
In the above rule, the action expression is pg_terminate_backend() and the condition expression is session_id:host:pid:runtime > 120.
The term session_id:host:pid:runtime is a scoped metric; runtime is the name of the metric and session_id:host:pid is the scope. This scoped metric specifies the elapsed execution time for a query executor process on a segment host. The colon-delimited sections of the scope and metric identify the source of the value:
  session_id – ID of a Greenplum Database query
  host – the name of a segment host
  pid – process ID of a query executor process running on the host
  runtime – elapsed time since the query executor process started
You create rules using the rule add command in an interactive gp-wlm session or with the --rule-add command-line option. Each rule has a unique name used for managing the rule with commands such as rule modify or rule delete.
A rule may also be labeled transient, which means the rule is active only until it is deleted or Workload Manager is restarted.
For details about the rule add command syntax and usage see Adding Rules.
For reference information about Workload Manager commands that manage existing rules (modify, delete, dump, import, and restore), see Managing Rules.
The next sections provide more detailed information about the components of a rule: action expressions, condition expressions, metrics, and scopes.
Action Expression
The action to perform when a rule is triggered is specified with one of the following Workload Manager actions:
  gpdb_record – record a custom message and details of the database query process in the gp_wlm_records database table.
  host:throttle_gpdb_query – throttle a Greenplum Database query on a specified host.
  host:pg_cancel_backend – cancel the current query in a session on a host by calling the PostgreSQL pg_cancel_backend() function.
  pg_terminate_backend – terminate a session by calling the PostgreSQL pg_terminate_backend() function.
A rule’s condition expression always identifies a single query executor process on a single Greenplum segment host. When a rule’s action executes, it will have in its context the query’s session ID, a segment host name, and the process ID of a single query executor process on the host.
Each of the actions responds to the single Greenplum Database query executor process identified by the condition expression. See Rule Actions for reference information for the actions.
Action expressions are written as functions and can have zero or more arguments, specified with key=value pairs in parentheses after the action name:
<action-name>(<arg1>=<value1>,<arg2>=<value2>,...)
gpdb_record
The gpdb_record action writes the text specified in its message argument to a log file, along with details of the database query process identified in the rule’s condition expression. For example, the gpdb_record action can log a message when any query process exceeds 120 seconds:
gpdb_record(message='query runtime exceeds 120 seconds') when session_id:host:pid:runtime > 120
The gp_wlm_records external Greenplum Database table provides SQL query access to the logged records. (See Querying the gp_wlm_records Table for more information.)
The gpdb_record action has several arguments, but only the message argument is required to be specified in the rule. Here is the full list of arguments for this action:
  message – Informative string describing the reason for recording
  current_query – The text of the current query
  gpdb_segment_role – Role of the database instance: GPDB_MASTER or GPDB_SEGMENT
  host – The hostname of the segment
  pid – The postgres process associated with the query
  query_start – Query start time
  session_id – Session id of the query
  usename – Name of the user logged into this backend
With the exception of message, a value for each of these arguments is inferred from the matched query process. gpdb_record logs a record that includes the supplied message, all of these inferred values, the text of the rule, and context values from the condition expression.
host:throttle_gpdb_query
The host:throttle_gpdb_query action holds a query to a maximum share of CPU on a host, specified in the max_cpu argument as a percentage of CPU utilization.
The host: prefix on the host:throttle_gpdb_query action is a scope. The host: scope indicates that the action will be performed only on the host machines where the rule’s condition is matched. The host:throttle_gpdb_query action is currently the only scoped action. (Metrics used in the condition expression are all scoped. See Metrics and Scopes below for details.)
This host:throttle_gpdb_query rule throttles a query on a host to 30% CPU utilization:
host:throttle_gpdb_query(max_cpu=30) when session_id:host:total_cpu > 20
The session_id:host:total_cpu scoped metric is the total percentage of CPU used by all query executor processes on a host working on the same query.
Note that this rule establishes a range between 20% and 30% CPU utilization. Throttling on a host begins when total CPU utilization for the query exceeds 20% and ends when it drops below 20%. Throttling keeps the CPU utilization from exceeding 30%. Setting the max_cpu argument higher than the rule’s trigger threshold prevents rapid alternation between the throttling-enabled and throttling-disabled states, which could occur if the threshold and maximum CPU were equal.
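The reasoning about the 20% to 30% range can be illustrated with a toy model. The Python sketch below is not how Workload Manager implements throttling. It assumes a query that would use a constant 40% CPU if left alone, that a throttled query is held exactly at max_cpu, and that the rule is re-evaluated once per interval. The function name simulate and all values are hypothetical.

```python
def simulate(demand, trigger, max_cpu, steps=6):
    """Return the throttled/unthrottled state after each evaluation interval."""
    throttled = False
    states = []
    for _ in range(steps):
        # While throttled, the query is assumed to be held at max_cpu.
        observed = min(demand, max_cpu) if throttled else demand
        # The rule condition: total CPU for the query exceeds the trigger.
        throttled = observed > trigger
        states.append(throttled)
    return states


print(simulate(40, trigger=20, max_cpu=30))  # stays throttled
print(simulate(40, trigger=20, max_cpu=20))  # alternates on and off
```

With max_cpu above the trigger, the throttled query still satisfies the condition and throttling stays on. With the two values equal, the throttled query no longer exceeds the trigger, so throttling switches off and on again at every interval.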
host:pg_cancel_backend
The host:pg_cancel_backend action cancels a query on a host. It executes the pg_cancel_backend() PostgreSQL function on the session matched by the condition expression.
The following rule cancels the current query in a session that exceeds 75% total CPU utilization on any segment host and has run for more than five minutes:
host:pg_cancel_backend() when session_id:host:total_cpu > 75 and session_id:host:pid:runtime > 300
When a rule cancels a query, Workload Manager logs the event in a log file on the segment host. These event records can be queried using the gp_wlm_events database view. The view depends on Greenplum external tables on each segment host and must first be set up using the manage-event-tables.sh script. See Querying Workload Manager Event Data for details.
pg_terminate_backend
The pg_terminate_backend action executes the PostgreSQL pg_terminate_backend() function on the session matched by the condition expression. This is an unscoped action because a session must be terminated on all segments.
The following rule terminates a session that exceeds 75% total CPU utilization on any segment host and has run for more than five minutes:
pg_terminate_backend() when session_id:host:total_cpu > 75 and session_id:host:pid:runtime > 300
When a rule terminates a query, Workload Manager logs the event in a log file on each segment host. These event records can be queried using the gp_wlm_events database view. The view depends on Greenplum external tables on each segment host and must first be set up using the manage-event-tables.sh script. See Querying Workload Manager Event Data for details.
Condition Expression
The condition expression (predicate) of a rule is a Boolean expression that identifies Greenplum Database queries you want to act upon.
Metrics can be compared to values using the following operators.
Operator | Value Format | Description
=  | A number for numeric metrics or a quoted string for strings. | Matches only when the values are exactly equal.
!= | A number for numeric metrics or a quoted string for strings. | Matches when the values are not equal.
=~ | Regular expression on the right side enclosed in slashes (/). metric =~ /sel.*by/ | Performs a regular expression match between the string value and the specified regex. Posix regular expression syntax is used.
>  | Number | Greater than
<  | Number | Less than
>= | Number | Greater than or equal to
<= | Number | Less than or equal to
Expressions can be arbitrarily complex, joining multiple comparisons with Boolean AND and OR operators and parentheses to enforce precedence. For example:
host:pid:cpu_util > 50 or (host:pid:cpu_util > 30 and session_id:host:pid:usename = "fred")
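To make the precedence in the example expression concrete, the Python sketch below evaluates the same Boolean logic against a few sample facts. It is an illustration only: the rules engine does not run Python, and the function name matches, the dictionary layout, and the sample values are hypothetical.

```python
def matches(facts):
    """Mirror: cpu_util > 50 or (cpu_util > 30 and usename = "fred")."""
    cpu = facts["host:pid:cpu_util"]
    user = facts["session_id:host:pid:usename"]
    return cpu > 50 or (cpu > 30 and user == "fred")


samples = [
    {"host:pid:cpu_util": 60, "session_id:host:pid:usename": "bob"},
    {"host:pid:cpu_util": 40, "session_id:host:pid:usename": "fred"},
    {"host:pid:cpu_util": 40, "session_id:host:pid:usename": "bob"},
]
results = [matches(s) for s in samples]
print(results)
```

Any process above 50% CPU matches regardless of role, while the lower 30% threshold applies only to processes running as fred.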
Including Clause
The including keyword introduces a comma-separated list of metrics to add to the context when a rule triggers. Any metric referenced in the condition expression is automatically added to the context. To add context values for metrics not used in the condition expression, list the metrics after the including keyword.
Metrics in the including clause are specified without scopes. If the rules compiler cannot infer the scope from scopes already bound in the rule, the rule fails compilation with an error message.
The following rule adds the host:pid:long_name and host:pid:avg_cpu_util metrics to the context:
gpdb_record(message="CPU over 50%") when host:pid:cpu_util > 50 including long_name, avg_cpu_util
The host:pid:cpu_util metric is in the context because it is referenced in the condition clause.
When a gpdb_record action triggers, the context metrics are added to the context_args column of the gp_wlm_records table. When a host:pg_cancel_backend or pg_terminate_backend action triggers, the context metrics are added to the context column of the gp_wlm_events view.
The additional metric values can provide useful information when investigating recorded messages and termination events.
Metrics and Scopes
Metrics are data items collected by the agent, and include operating system statistics, OS process statistics, and database query data.
Workload Manager provides a rich set of metrics to use in condition expressions so that you can target queries and query processes with very specific characteristics. For example, a rule could target queries executed with a certain database role that access a certain table and use over 30% of CPU on any host.
The name of a metric is prefixed by its scope, which provides context for the metric. The host:pid scope of the host:pid:cpu_util metric, for example, means that the cpu_util metric is the percentage of CPU used by an OS process (pid) executing on a specific host (host). The session_id:host:pid scope for the session_id:host:pid:usename metric indicates that the usename metric is the database role executing a Greenplum Database segment query process. The session_id is the id of the query and host is the segment host where the query executor process, pid, is executing.
Metrics in the including list of a rule are specified without scopes. The rules compiler searches for included metrics in scopes already bound in the condition expression and fails if the scope cannot be inferred.
Rules must be written in a way to identify a single query executor process on a host. The following rule records a message when the resident memory size for any process exceeds 20%. The host:pid scope does not include a session_id, so an additional regex term is added to the condition expression to match any query. This ensures that the host:pid:resident_size_pct metric is from a query executor process and that the action has a known query when it executes. Without the session_id:host:pid:usename comparison, this rule would fail to compile.
rule add mem_high_segment_useage_20
    gpdb_record(message="MEM: high segment pct usage - 20%") when
    host:pid:resident_size_pct > 20
    and session_id:host:pid:usename =~ /.*/
Workload Manager Metric Reference lists all of the metrics, their scopes, and their data formats.
datid Scope
The datid scope is for metrics that are values from a single database in the Greenplum Database system. The datid:datname metric, for example, can be used to restrict a rule to a specific database:
... and datid:datname = 'my_db'
Metrics with datid scope must be combined in the condition expression with other metrics that identify a query process.
gpdb Scope
The gpdb scope is for metrics from the entire Greenplum Database system. There is currently just one such metric: gpdb:total_master_connections, which is the total number of client connections for all databases in the system. This metric could be used to prevent a rule from triggering until a specified number of connections is exceeded.
host Scope
The host scope applies to metrics that are values from a single host in the Greenplum cluster. These include the current date and time values from the host and the host’s total CPU utilization.
host:segment_id Scope
The host:segment_id scope is used for metrics from a single Greenplum segment. It is used for metrics that report the virtual memory (vmem) usage for a segment.
host:pid Scope
The host:pid scope is for metrics referring to any operating system process on a host. These metrics include the memory, CPU, and I/O statistics available from Linux for OS processes. Metrics with host:pid scope can be used to narrow a rule to query processes using more host resources than expected.
session_id Scope
A session_id is the Greenplum cluster-wide ID for a database query. The metrics with session_id scope are CPU and disk I/O skew statistics for a single query that Workload Manager calculates from the host:pid metrics from all query executor processes on all segment hosts for the query.
session_id:host Scope
The session_id:host scope includes metrics that are aggregated memory, CPU, and I/O statistics for all processes on all hosts running a query.
session_id:host:segment_id Scope
The session_id:host:segment_id scope includes metrics that report the amount of virtual memory (vmem) consumed by a Greenplum segment for a session.
session_id:host:pid Scope
The session_id:host:pid scope is used for metrics that take values from a query executor process on a single segment host.
Adding Rules
Add Rule Command Syntax
The rule add command adds a rule. Here is the syntax for the rule add command:
rule add [transient] <name> <action-name>(<action-args>) when <expression> [including <metric-list>]
transient
Rules may be persistent or transient. A persistent rule remains active until it is deleted, while a transient rule disappears when the rules engine service is shut down on all hosts. Rules are persistent by default; you must include the transient keyword to create a transient rule.
<name>
A unique name for the rule. The name all is reserved.
<action-name>
The action to perform. One of the following:
host:throttle_gpdb_query - specify a maximum allowed CPU utilization percentage for a Greenplum Database query.
host:pg_cancel_backend - cancel the current query on a host by calling the PostgreSQL pg_cancel_backend() function.
pg_terminate_backend - terminate a session by calling the PostgreSQL pg_terminate_backend() function.
gpdb_record - record an event about a query in the gp_wlm_records table.
<action-args>
Arguments that pass values to the action, if needed. An argument is specified as an arg-name=value pair. Multiple arguments are separated by commas.
when <expression>
A Boolean expression that filters targets for the action. The expression references one or more metrics to filter the facts that trigger the action. The expression may contain Posix regular expressions (regex).
including <metric-list>
An optional, comma-separated list of metrics to add to the context when the rule triggers. Without an including clause, the action context contains only values for metrics referenced in the expression clause. Add the including clause to add values for additional metrics to the action context.
Metrics in the <metric-list> are specified without scope prefixes. If the Workload Manager rule compiler cannot find a metric in any currently bound scope, adding the rule fails with an error message.
When gpdb_record, host:pg_cancel_backend, and pg_terminate_backend actions are triggered, the metrics in <metric-list> are added to the context arguments columns in the gp_wlm_records table or gp_wlm_events view.
A metric in the including clause is not added to the context arguments columns if it is already present as a separate column. For example, the usename metric has its own column, so adding this metric to the including clause has no effect.
When the following rule action is triggered by a query that runs longer than 10 minutes, the values of the total_cpu and spillfile_size_across_cluster metrics are recorded and shown in the context:
mdw/gpdb-cluster> rule add myrule gpdb_record(message="rich context")
when session_id:host:pid:runtime > 600
including total_cpu, spillfile_size_across_cluster
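The syntax described above can be sketched as a small Python helper that assembles a rule add command string. This is an illustration only: build_rule_add is a hypothetical name and is not part of gp-wlm, and the double-quoting of string arguments and the space after each comma are assumptions based on the examples in this topic.

```python
def build_rule_add(name, action, expression, args=None, including=None, transient=False):
    """Assemble a 'rule add' command string (hypothetical helper, not part of gp-wlm)."""
    if name == "all":
        # The documentation states that the name "all" is reserved.
        raise ValueError('the rule name "all" is reserved')

    def fmt(value):
        # Assumption: string values are double-quoted, numbers are written bare.
        return f'"{value}"' if isinstance(value, str) else str(value)

    arg_text = ", ".join(f"{key}={fmt(val)}" for key, val in (args or {}).items())
    parts = ["rule add"]
    if transient:
        parts.append("transient")
    parts.append(name)
    parts.append(f"{action}({arg_text})")
    parts.append(f"when {expression}")
    if including:
        # Metrics in the including clause are listed without scope prefixes.
        parts.append("including " + ", ".join(including))
    return " ".join(parts)


command = build_rule_add(
    "myrule",
    "gpdb_record",
    "session_id:host:pid:runtime > 600",
    args={"message": "rich context"},
    including=["total_cpu", "spillfile_size_across_cluster"],
)
print(command)
```

The helper emits the rule on a single line, which is how rules must be entered.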
Rule Actions
host:throttle_gpdb_query
Throttle a Greenplum Database query on a specified host.
Arguments:
max_cpu - Hold process to a maximum of this percentage CPU utilization.
pid - The process to throttle.
session_id - The session to throttle.
The max_cpu argument is required. The pid and session_id arguments can be inferred from the session_id in the when clause and are normally omitted.
host:pg_cancel_backend
Cancel a query on a host. This action calls the PostgreSQL pg_cancel_backend administrative function.
Arguments:
session_id - The session ID of the query to cancel.
The argument is normally omitted, allowing the session ID to be inferred by using the session_id in the rule's when clause. Workload Manager then determines which session to cancel. The action sends a SIGINT signal to the session process, which cancels the current query. See http://www.postgresql.org/docs/9.3/static/functions-admin.html for more details.
The following example cancels the current query in any session that has been executing for more than 20 seconds:
mdw/gpdb-cluster> rule add cancel_query host:pg_cancel_backend()
when session_id:host:pid:runtime > 20
pg_terminate_backend
Terminate a session on all hosts. This action calls the PostgreSQL pg_terminate_backend administrative function.
Arguments:
session_id - The session ID to terminate.
The argument is normally omitted, allowing the session ID to be inferred by using the session_id matched by the rule's when clause. Workload Manager then determines which pid to terminate. See http://www.postgresql.org/docs/9.3/static/functions-admin.html for more details.
The following example terminates any session that has been executing for more than 20 seconds:
mdw/gpdb-cluster> rule add cancel_session pg_terminate_backend()
when session_id:host:pid:runtime > 20
gpdb_record
Logs a message to the gp_wlm_records table when a rule is matched.
Arguments:
message - Informative string describing the reason for recording.
The following example logs all queries:
mdw/gpdb-cluster> rule add record_query gpdb_record(message="all") when session_id:host:pid:usename =~ /.*/
See Querying the gp_wlm_records Table for information about the gp_wlm_records table.
Managing Rules
Using commands described in this topic, rules can be displayed, deleted, modified, and saved to or restored from disk. Each of the commands has a gp-wlm command-line equivalent.
Displaying Rules
Use the rule show command to see existing rules. You can show all existing rules or specify a single rule by name.
rule show {all | rule-name}
The rule show all command in this example lists all registered rules:
mdw/gpdb-cluster> rule show all
--- Name -------------- Expression ----------
record_query    gpdb_record(message="all") when session_id:host:pid:usename =~ /.*/
cancel_query    pg_terminate_backend() when session_id:host:pid:runtime > 20
throttle_query  host:throttle_gpdb_query(max_cpu=20) when session_id:host:pid:current_query =~ /.*select count.*/
This example lists a single rule by name:
mdw/gpdb-cluster> rule show throttle_query
--- Name -------------- Expression ----------
throttle_query  host:throttle_gpdb_query(max_cpu=20) when session_id:host:pid:current_query =~ /.*select count.*/
Deleting Rules
The rule delete command removes a rule.
rule delete rule-name
To delete all rules at once, use rule delete all:
rule delete all
If there are no rules, this command returns an error.
Modifying a Rule
Use the rule modify command to alter the expression for an existing rule. You may also remove the transient keyword from the rule declaration to convert it to a persistent rule. Conversion from persistent to transient is not currently supported.
rule modify [transient] name action-name(action-args)
when expression
This example modifies the cancel_query rule to alter the number of seconds a query runs on a host to trigger the rule from 20 to 25:
mdw/gpdb-cluster> rule modify cancel_query pg_terminate_backend() when session_id:host:pid:runtime > 25
Saving Rules to Disk
The rule dump command saves all persistent rules in the cluster to a text file, one rule per line.
rule dump path
If you do not provide the full path to the file, the file is written relative to the directory where you started the gp-wlm session. The user running gp-wlm must have permission to write the file at the specified location. If the file exists, the rule dump command overwrites it.
The following example saves rules to the /home/gpadmin/rules/20150910-1.txt file. If the /home/gpadmin/rules directory does not exist, an error is reported.
mdw/gpdb-cluster> rule dump /home/gpadmin/rules/20150910-1.txt
Importing Rules from Disk
The rule import command imports rules from a file into the active set of rules. Imported rules replace existing rules with the same names. Existing rules with names not present in the file are unchanged.
rule import path
Restoring Rules from Disk
The rule restore command restores all rules from a file, replacing any existing rules. It is equivalent to rule delete all followed by rule import path.
rule restore path
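The difference between importing and restoring can be sketched with plain Python dictionaries that map rule names to rule text. This models only the behavior described above; import_rules and restore_rules are hypothetical names and do not reflect how Workload Manager stores rules internally.

```python
def import_rules(active, from_file):
    """Model 'rule import': same-named rules are replaced, all others are kept."""
    merged = dict(active)
    merged.update(from_file)
    return merged


def restore_rules(active, from_file):
    """Model 'rule restore': equivalent to deleting all rules, then importing."""
    return import_rules({}, from_file)


active = {
    "cancel_query": "pg_terminate_backend() when session_id:host:pid:runtime > 20",
    "record_query": 'gpdb_record(message="all") when session_id:host:pid:usename =~ /.*/',
}
saved = {
    "cancel_query": "pg_terminate_backend() when session_id:host:pid:runtime > 25",
}

after_import = import_rules(active, saved)
after_restore = restore_rules(active, saved)
print(sorted(after_import))
print(sorted(after_restore))
```

After the import both rules remain and cancel_query carries the definition from the file; after the restore only cancel_query remains.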
Example Rules
This section provides examples of rules written for various purposes.
Note: Rules must be entered on a single line, but the rules shown in this section are wrapped for readability.
Record high cpu utilization queries
The following rule invokes the gpdb_record action when the gpadmin user runs a query and its total cpu utilization on a host exceeds 100%.
rule add simple gpdb_record(message="Too much cpu for gpadmin")
when session_id:host:total_cpu > 100
and session_id:host:pid:usename = 'gpadmin'
Complex rule
This rule invokes gpdb_record for a query that meets the following criteria:
a query has total CPU usage greater than 90% on a host and has been running for more than 45 seconds, or
has cpu skew greater than 20, and
is a select on a table that contains "test" in its name.
rule add comborule gpdb_record(message="My Message")
when ((session_id:host:total_cpu > 90 and session_id:host:pid:runtime > 45)
or session_id:cpu_skew > 20)
and session_id:host:pid:current_query =~ /select.*test/
The rule shows how you can group Boolean expressions with parentheses.
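The grouping in the comborule expression can be mirrored in Python to check which combinations of metric values satisfy it. This is a sketch for reasoning about the expression, not the rules engine's evaluation logic; combo_rule_matches is a hypothetical name, and treating =~ as an unanchored regular expression search is an assumption.

```python
import re


def combo_rule_matches(total_cpu, runtime, cpu_skew, current_query):
    """Evaluate the same Boolean grouping as the comborule when clause."""
    busy_and_long = total_cpu > 90 and runtime > 45
    skewed = cpu_skew > 20
    # Assumption: =~ /select.*test/ behaves like an unanchored regex search.
    is_select_on_test = re.search(r"select.*test", current_query) is not None
    return (busy_and_long or skewed) and is_select_on_test


# High CPU for long enough, on a matching query: the rule matches.
print(combo_rule_matches(95, 50, 0, "select * from test_table"))
# High CPU but only 10 seconds of runtime and no skew: no match.
print(combo_rule_matches(95, 10, 0, "select * from test_table"))
# Skew alone satisfies the parenthesized group.
print(combo_rule_matches(10, 1, 25, "select * from mytest"))
```

Without the outer parentheses the regex condition would bind only to the skew test, because and has higher precedence than or in Python; the parentheses in the rule make the intended grouping explicit.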
Record queries with high memory usage
This rule records a message when a query process exceeds 20% of the resident memory on a host.
rule add transient mem_high_segment_useage_20
gpdb_record(message="MEM: high segment pct usage - 20%") when
host:pid:resident_size_pct > 20
and session_id:host:pid:usename =~ /.*/
Record queries with memory (rss) skew above 10%
This rule calls the gpdb_record action to log when memory skew exceeds 10% on a host.
rule add mem_skew_10 gpdb_record(message="MEM: query skew 10")
when session_id:resident_size_pct_skew > 10
and session_id:host:pid:usename =~ /.*/
Record high CPU queries on a host when overall CPU utilization is high on that host
This rule records queries that use a large share of the CPU on a host while the overall CPU utilization on that host is also high.
rule add high_query_cpu_on_host gpdb_record(message="High query CPU on host") when
session_id:host:total_cpu > 60 and
host:node_cpu_util > 80 and
session_id:host:pid:usename =~ /.*/
Record high CPU query processes when overall CPU utilization on a host is high
This rule records processes that use a large share of the CPU while the overall CPU utilization on the host is also high.
rule add high_cpu gpdb_record(message="High CPU usage") when
host:pid:cpu_util > 10 and
host:node_cpu_util > 80 and
session_id:host:pid:runtime > 0
Record queries with high spill file count
This rule records the total number of spill files created for a query across the cluster when it exceeds the specified limit.
rule add spills gpdb_record(message="High spillfile count") when
session_id:host:pid:spillfile_count_across_cluster > 2500
Record queries with high vmem usage when segment vmem usage is high as well
This rule records vmem usage of queries and segments when they both exceed specified limits. This is a rule that can be used with a kill query action when the behavior of runaway_detector_activation_percent, which is to kill the query that consumes the highest amount of memory, is not desirable. It is recommended to turn off runaway_detector_activation_percent if you intend to kill queries with this rule.
This rule can be further refined to select or filter out specific users, applications, databases, etc.
rule add high_vmem gpdb_record(message="High segment and query vmem usage") when
host:segment_id:total_vmem_size_pct > 50 and
session_id:host:segment_id:vmem_size_pct > 5 and
session_id:host:pid:runtime > 0
Record number of bytes written to disk on a host by any query process
This rule records the number of bytes written to disk on a host by any query process.
rule add disk_write gpdb_record(message='disk') when
host:pid:disk_write_bytes > 0 and session_id:host:pid:datname = 'mydb'
Record total number of bytes written to disk per sec on a host by all query processes
This rule records the total number of bytes written to disk per second on a host by all query processes.
rule add disk_write_per_sec gpdb_record(message='disk per sec') when
session_id:host:total_disk_write_bytes_per_sec > 0 and
session_id:host:pid:application_name =~ /my_app/
Cancel any query where the session has run longer than 120 seconds
This rule invokes the host:pg_cancel_backend action when a session_id:host:pid:runtime exceeds two minutes.
rule add kill_long host:pg_cancel_backend()
when session_id:host:pid:runtime > 120
Throttle the cpu utilization of a query
This rule invokes the host:throttle_gpdb_query action when the cpu utilization of a process exceeds a threshold and the query has run for more than 20 seconds.
rule add throttle host:throttle_gpdb_query(max_cpu=30)
when host:pid:cpu_util > 20
and session_id:host:pid:usename = 'gpadmin'
and session_id:host:pid:runtime > 20
Throttle and even out skew
This rule invokes host:throttle_gpdb_query when the total cpu usage of a query on a host exceeds 100% and the current query is a select on the skewtest table.
rule add skewrule host:throttle_gpdb_query(max_cpu=50)
when session_id:host:total_cpu > 100
and session_id:host:pid:current_query =~ /select.*skewtest/
You can observe the effects of this rule in the gptop GPDB Skew page.
Best Practices for Rules
1. Avoid creating rules that modify the condition the rule's expression is matching. For example, consider this rule:
host:throttle_gpdb_query(max_cpu=20) when host:pid:cpu_util > 30 and session_id:host:pid:runtime > 0
If CPU usage goes above 30%, the rule triggers and reduces the usage to 20%. When the usage falls below 30%, the rule is no longer matched, so the throttling ends and usage can again climb to 30%. This creates an undesirable cyclic behavior. Instead, create a rule like the following:
host:throttle_gpdb_query(max_cpu=30) when host:pid:cpu_util > 20
and session_id:host:pid:runtime > 0
This rule triggers at 20% CPU utilization and throttles the CPU to 30% utilization. The throttling continues until utilization drops below 20%. The session_id:host:pid:runtime condition is true for any running query and provides the necessary session_id for the throttle_gpdb_query action.
2. Avoid creating rules that terminate a query based on skew alone. Consider the following rule:
pg_terminate_backend() when session_id:resident_size_pct_skew > 10
This is a poor rule for two reasons. First, it terminates all queries when skew is above 10, including queries that were not contributing to skew. Second, well-behaved queries can temporarily experience skew high enough to achieve this condition. For example, if the segments do not complete a query at the same time, skew can appear near the end of execution. A query could run normally across several nodes and then, as each node completes its portion of the query, its resource utilization drops, causing a temporary increase in skew while other nodes are still running.
3. Rules that match data with datid: scope will trigger for any database in the cluster unless a predicate is added to confine the match to a target database. For example, this rule triggers whenever the number of connections to any single database exceeds 10:
gpdb_record(message="exceeded 10 connections")
when session_id:host:pid:runtime > 0
and datid:numbackends > 10
Add a predicate to filter for the database associated with the session:
gpdb_record(message="exceeded 10 connections on foo")
when session_id:host:pid:runtime > 0
and datid:datname = "foo"
and datid:numbackends > 10
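The cyclic behavior described in the first best practice can be illustrated with a toy simulation. The model is deliberately simplified and is an assumption, not a description of the rules engine: a process always wants a fixed amount of CPU (demand), and on each evaluation the throttle holds it to max_cpu only while the rule condition cpu_util > threshold is true. The function name simulate_throttle is hypothetical.

```python
def simulate_throttle(threshold, max_cpu, demand=80, ticks=6):
    """Return the CPU utilization observed at each rule evaluation."""
    util = demand
    history = []
    for _ in range(ticks):
        history.append(util)
        if util > threshold:
            # Rule matches: the throttle holds the process to max_cpu.
            util = min(demand, max_cpu)
        else:
            # Rule no longer matches: throttling ends, usage climbs back.
            util = demand
    return history


# max_cpu below the trigger threshold: throttling switches on and off.
print(simulate_throttle(threshold=30, max_cpu=20))
# max_cpu above the trigger threshold: throttling stays in effect.
print(simulate_throttle(threshold=20, max_cpu=30))
```

In the first call the throttled value (20) no longer satisfies the condition (> 30), so the rule releases the process and the cycle repeats. In the second call the throttled value (30) still satisfies the condition (> 20), so the throttle remains applied.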
Caveats
Rule Conditions Must Include a session_id
To write a rule that performs a Greenplum Database action (gpdb_record, pg_terminate_backend, host:throttle_gpdb_query), the condition must include a session_id, even when the intended condition is based solely on process information. For example, the following rule appears to terminate any query that uses more than 20% of system memory:
pg_terminate_backend() when host:pid:resident_size_pct > 20
However, because this rule contains no session_id, Workload Manager cannot infer the query to terminate, and the rule will not be added. To get the desired behavior, add an always-true session_id condition to the rule, for example:
pg_terminate_backend() when host:pid:resident_size_pct > 20
and session_id:host:pid:runtime > 0
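A quick client-side check for this caveat can be sketched in Python before submitting a rule. This is a hypothetical pre-check, not the validation that Workload Manager itself performs; references_session_id is an illustrative name, and the pattern only looks for a metric whose scope starts with session_id.

```python
import re

# Matches a scoped metric that begins with session_id, such as
# session_id:cpu_skew or session_id:host:pid:runtime.
SESSION_METRIC = re.compile(r"(?<![\w:])session_id(:\w+)+")


def references_session_id(expression):
    """Return True if the when expression uses a session_id-scoped metric."""
    return SESSION_METRIC.search(expression) is not None


rejected = "host:pid:resident_size_pct > 20"
accepted = "host:pid:resident_size_pct > 20 and session_id:host:pid:runtime > 0"
print(references_session_id(rejected))
print(references_session_id(accepted))
```

A regular expression literal inside the rule that happens to contain the text session_id: would also satisfy this check, so treat it as a convenience rather than a guarantee.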
Queries Executing in Under Five Seconds are Ignored
Queries that run for less than five seconds are ignored by Workload Manager in order to minimize load on the system and to help focus on queries that consume greater resources.
Avoid Race Conditions When Using Vmem Metrics
In rare conditions, if memory allocated for a segment is close to exceeding gp_vmem_protect_limit or runaway_detector_activation_percent, a query that triggers these limits may be killed by the vmem protector before Workload Manager can cancel another query that has met a vmem-related Workload Manager rule condition.
For example, query Q1 may be an important query that consumes a significant amount of memory. Workload Manager wants to protect Q1 by killing other less important queries, Q2 and Q3, which consume less memory. If the total memory usage for a segment running these queries is close to runaway_detector_activation_percent and Workload Manager decides to kill Q2 and Q3 at time t, Q1 may be killed due to segment memory exceeding runaway_detector_activation_percent at time t+1, and Q2 and Q3 may be killed by Workload Manager at time t+2 based on the decision made at time t. This issue can be avoided by disabling runaway_detector_activation_percent and ensuring a Workload Manager rule triggers well before gp_vmem_protect_limit can be reached. The host:segment_id:total_vmem_size_pct and session_id:host:segment_id:vmem_size_pct metrics can be used for this purpose. Here is an example rule:
cancel_Q2_vmem_exceed host:pg_cancel_backend() when
host:segment_id:total_vmem_size_pct > 65 and
session_id:host:segment_id:vmem_size_pct > 5 and
session_id:host:pid:current_query =~ /Q2/
If you would like to use these vmem metrics, be sure to enable them as described in the Vmem section of the Workload Manager Metric Reference.
Querying Workload Manager Record Data
The gp_wlm_records table contains a record of events describing where, why, and how the gpdb_record action was triggered by a rule on the Greenplum cluster.
The gp_wlm_records table is created in the postgres database by default. A different database can be specified at installation time with the --dbname-records installation option.
The table has the following structure:
Column | Type | Description
id | text | A unique identifier for each time a rule matches a query. Each unique value exists in exactly two rows: one in which the value for the rule_state column is BEGIN and the other in which the value for the rule_state column is END. The BEGIN row indicates when a rule condition begins to match a query and the END row when the condition ends matching.
rule_state | text | The state of the rule. Possible values are BEGIN and END. WLM creates a record with a state of BEGIN when a query begins to match a rule, and a second record with a state of END when the query no longer matches.
time | timestamptz | The time (with timezone) the record was created.
query_start | timestamptz | The time the query started executing.
rulename | text | The name of the Workload Manager rule that was matched.
session_id | integer | ID of the session that was running the matched query.
pid | integer | ID of the process that was running the matched query.
hostname | text | The host on which the event occurred.
username | text | The role name from the session that matched this rule trigger.
db_name | text | The name of the database.
application_name | text | The name of the client application that executed the query, for example psql.
context | text | A comma-delimited list of rule-specific contextual metrics.
gpdb_segment_role | text | The role of the segment that matched the condition. It can be one of three possible values: GPDB_MASTER, GPDB_SEGMENT, or GPDB_MIRROR.
message | text | The message that was passed as a parameter to the gpdb_record action.
rule | text | The rule expression.
current_query | text | The text of the current query in the session.
The primary identifier of each entry in the table is the id column. This column stores a unique identifier that represents a specific rule that triggered on a specific node in the cluster. If a rule triggers on more than one node in the cluster at the same time, each node is treated as a separate event and receives a unique identifier.
Following are two sample entries from the gp_wlm_records table. In this example, a rule was created to track when a query runs for more than 120 seconds:
=# \x on
Expanded display is on.
=# select * from gp_wlm_records;
-[ RECORD 1 ]-----+----------------------------------------------------------------
id                | 7a3e65b6-cd89-40ba-83ae-a502b7c480cf
rule_state        | BEGIN
time              | 2017-06-23 17:38:38-05
query_start       | 2017-06-23 17:38:29.502544-05
rulename          | over120
session_id        | 27194
pid               | 118516
hostname          | mdw
username          | gpadmin
db_name           | postgres
application_name  | psql
context           | runtime=121
gpdb_segment_role | GPDB_MASTER
message           | Query exceeds 120 seconds.
rule              | gpdb_record(message="Query exceeds 120 seconds.") when session_id:host:pid:runtime > 120
current_query     | delete from test where f1()
-[ RECORD 2 ]-----+----------------------------------------------------------------
id                | 7a3e65b6-cd89-40ba-83ae-a502b7c480cf
rule_state        | END
time              | 2017-06-23 17:39:20-05
query_start       |
rulename          |
session_id        |
pid               |
hostname          |
username          |
db_name           |
application_name  |
context           |
gpdb_segment_role |
message           |
rule              |
current_query     |
In the above example, the rule_state column represents when a query began triggering a rule on a given node and when it stopped. The hostname column stores the host on which the rule triggered.
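Because each id appears in one BEGIN row and one END row, the two rows can be paired to find how long a rule condition kept matching. The following Python sketch does this for rows that have already been fetched into dictionaries. The function names are hypothetical, and the timestamp parsing assumes the hours-only UTC offset shown in the sample output (for example -05).

```python
from datetime import datetime


def parse_ts(text):
    # Assumption: the offset is hours-only ("-05"), so "00" minutes are appended
    # to produce a form that the %z directive accepts ("-0500").
    return datetime.strptime(text + "00", "%Y-%m-%d %H:%M:%S%z")


def match_durations(rows):
    """Return {id: seconds between the BEGIN and END rows} for complete pairs."""
    begins, ends = {}, {}
    for row in rows:
        target = begins if row["rule_state"] == "BEGIN" else ends
        target[row["id"]] = parse_ts(row["time"])
    return {
        rid: (ends[rid] - begins[rid]).total_seconds()
        for rid in begins
        if rid in ends
    }


rows = [
    {"id": "7a3e65b6-cd89-40ba-83ae-a502b7c480cf", "rule_state": "BEGIN",
     "time": "2017-06-23 17:38:38-05"},
    {"id": "7a3e65b6-cd89-40ba-83ae-a502b7c480cf", "rule_state": "END",
     "time": "2017-06-23 17:39:20-05"},
]
durations = match_durations(rows)
print(durations)
```

An id that has a BEGIN row but no END row is skipped, which corresponds to a rule condition that is still matching.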
Querying Workload Manager Event Data
When a Workload Manager rule successfully executes a pg_terminate_backend() or host:pg_cancel_backend() action to cancel a Greenplum Database query, the event is logged to a file on the host. You can use the gp_wlm_events view to query the logged events. The gp_wlm_events view is created during installation in the database specified with the Workload Manager installer --dbname command-line option. The default is the postgres database.
The following table describes the contents of the gp_wlm_events view.
Column name | Type | Description
id | text | A unique identifier for each row.
time | timestamptz | The time (with timezone) the event record was created.
rulename | text | The name of the triggered rule.
action | text | The component that triggered the event.
session_id | integer | The ID of the session that matched this rule trigger.
hostname | text | The host on which the event occurred.
username | text | The role name from the session that matched this rule trigger.
db_name | text | The database name of the session.
application_name | text | The name of the client application of the session that matched this rule trigger.
context | text | A comma-delimited list of rule-specific contextual metrics.
rule | text | The rule expression.
current_query | text | The text of the current query in the session.
The session_id, username, current_query, db_name, and application_name columns match columns with similar names in the pg_stat_activity system view row for the process that matched the rule trigger. See pg_stat_activity.
Since the view is based on external tables, each time you run a query, the view is refreshed from the event logs on the Greenplum hosts.
Following is an example of pg_cancel_backend and pg_terminate_backend rows in the gp_wlm_events view:
postgres=# select * from gp_wlm_events;
-[ RECORD 1 ]----+-------------------------------------------------------------------------------
id               | e7054d71-293b-4bce-a3bb-caafbcbc6758
time             | 2017-07-31 19:31:02-08
rulename         | test
action           | pg_cancel_backend
session_id       | 4200
hostname         | localhost.localdomain
username         | pivotal
db_name          | postgres
application_name | psql
context          | runtime=6,host=localhost.localdomain,session_id=4200,host=localhost.localdomain
rule             | host:pg_cancel_backend() when session_id:host:pid:runtime > 5
current_query    | select pg_sleep(10);
-[ RECORD 2 ]----+-------------------------------------------------------------------------------
id               | 0c1f50dd-e1fc-4dd8-9829-7e450f74fde8
time             | 2017-07-31 19:37:30-08
rulename         | test2
action           | pg_terminate_backend
session_id       | 4226
hostname         | localhost.localdomain
username         | pivotal
db_name          | postgres
application_name | psql
context          | runtime=8,session_id=4226
rule             | pg_terminate_backend() when session_id:host:pid:runtime > 5
current_query    | <IDLE>
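The context column packs several metric values into one comma-delimited string. A minimal Python sketch for splitting it into a dictionary follows; parse_context is a hypothetical helper, and it assumes that metric values never contain a comma or an equals sign, which holds for the sample rows but is not guaranteed by this documentation.

```python
def parse_context(context):
    """Split 'name=value,name=value' into a dict; later duplicates overwrite."""
    metrics = {}
    for item in context.split(","):
        if not item:
            continue  # tolerate an empty context string
        name, _, value = item.partition("=")
        metrics[name] = value
    return metrics


sample = "runtime=6,host=localhost.localdomain,session_id=4200,host=localhost.localdomain"
parsed = parse_context(sample)
print(parsed)
```

Values are kept as strings; convert them (for example int(parsed["runtime"])) once the expected type of a metric is known.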
To see the old CSV event files, run this command, with a host file containing the names of all segment hosts. See gpssh in the Greenplum Database Utility Guide for instructions to create a host file.
$ gpssh -f <hostfile> -e "find <INSTALL_DIR>/gp-wlm/ -name 'events*.csv' -exec ls {} \;"
To delete the old CSV event files, run this command:
$ gpssh -f <hostfile> -e "find <INSTALL_DIR>/gp-wlm/ -name 'events*.csv' -exec rm {} \;"
Configuring Workload Manager Components
You can use the Greenplum Workload Manager config command to view, override, and describe certain Workload Manager configuration settings. The config command may be run interactively in a gp-wlm session or in batch mode at the command line. The command must be run on the Greenplum master host.
See Using the Greenplum Workload Manager Command Line for gp-wlm command-line syntax and usage.
Note
The config command works only with settings that can be changed by users.
When viewing, describing, or setting the value of a configuration setting, you must specify its Workload Manager component. A component can be an individual service, plugin, or command-line tool that is a part of the Workload Manager system.
In interactive mode, you can press the Tab key twice to see which components and settings are available for the show, describe, and modify commands.
Viewing Configuration Values
To view the current value of a configuration setting while in a gp-wlm session, use the following syntax:
> config show <component> <setting>
For example, the following command shows the logging level of the rules engine service:
> config show rulesengine logging:log_level
From the command line, use the --config-show option:
$ gp-wlm --config-show='<component> <setting>'
For example:
$ gp-wlm --config-show='rulesengine logging:log_level'
Describing Configuration Values
Use the config describe command to see a description of a setting and constraints for the setting's values.
In a gp-wlm session, the syntax is:
> config describe <component> <setting>
On the gp-wlm command line, use the --config-describe command-line option:
$ gp-wlm --config-describe='<component> <setting>'
For example, to describe the logging level of the rules engine in an interactive gp-wlm session, use this command:
> config describe rulesengine logging:log_level
The output of the command looks like the following:
component: rulesengine
setting: logging:log_level
description: The log verbosity of the rules engine daemon
valid values: err, warn, info, debug, trace
Here is the same command in batch mode at the command line:
$ gp-wlm --config-describe='rulesengine logging:log_level'
Modifying Configuration Values
Use the config modify command to change the value of a Workload Manager configuration setting. Changing a configuration setting automatically changes the setting on all hosts in the cluster.
In an interactive gp-wlm session, use this syntax:
> config modify <component> <setting>=<value>
At the command line, use the gp-wlm --config-modify option, with the following syntax:
$ gp-wlm --config-modify='<component> <setting>=<value>'
The new setting is persisted, and will be preserved during future Workload Manager software upgrades.
When a setting for a service is modified, the affected service is automatically restarted on every host in the cluster. However, this can only occur automatically if the cfgmon service is running on the Greenplum master at the time the setting is changed. If the cfgmon service is not running, the setting is still updated persistently, but the new value is not broadcast to the rest of the cluster until the cfgmon service is started. By default, the cfgmon service is always running.
Configurable Workload Manager Settings
The following table lists settings that can be viewed, described, and configured using the config command.
Component | Setting | Description | Type | Constraints | Default
agent | logging:log_level | Log verbosity of agent | String | Valid values: err, warn, info, debug, trace | info
cfgmon | logging:log_level | Log verbosity of cfgmon | String | Valid values: err, warn, info, debug, trace | info
gpdb_stats | collect_frequency | How often to collect GPDB statistics information | Float | Valid range: 0.1 - 60.0 seconds | 1.0
gpdb_stats | publish_frequency | How often to publish GPDB statistics information | Float | Valid range: 0.1 - 60.0 seconds | 4.0
gpdb_stats | publish_idle_sessions | Publish information about idle Greenplum Database sessions | Boolean | 'true' or 'false' | 'false'
rulesengine | engine:rule_frequency | Frequency of rule evaluation in seconds | Float | Valid range: 0.1 - 60.0 seconds | 2.0
rulesengine | logging:log_level | Log verbosity of rules engine | String | Valid values: err, warn, info, debug, trace | info
systemdata | logging:log_level | Log verbosity of systemdata plugin | String | Valid values: err, warn, info, debug, trace | info
systemdata | publish_idle_processes | Publish information about idle Greenplum Database processes | Boolean | 'true' or 'false' | 'false'
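The constraints in the table above can be checked before a change is submitted. The following Python sketch encodes the table and builds the batch-mode command string only when the value satisfies its constraint. It is an illustration under stated assumptions: config_modify_command and the SETTINGS dictionary are hypothetical, the check is purely local, and gp-wlm performs its own validation.

```python
LOG_LEVELS = {"err", "warn", "info", "debug", "trace"}


def is_log_level(value):
    return value in LOG_LEVELS


def is_frequency(value):
    # Valid range from the table: 0.1 - 60.0 seconds.
    try:
        return 0.1 <= float(value) <= 60.0
    except ValueError:
        return False


def is_boolean(value):
    return value in {"true", "false"}


# (component, setting) -> constraint check, transcribed from the table above.
SETTINGS = {
    ("agent", "logging:log_level"): is_log_level,
    ("cfgmon", "logging:log_level"): is_log_level,
    ("gpdb_stats", "collect_frequency"): is_frequency,
    ("gpdb_stats", "publish_frequency"): is_frequency,
    ("gpdb_stats", "publish_idle_sessions"): is_boolean,
    ("rulesengine", "engine:rule_frequency"): is_frequency,
    ("rulesengine", "logging:log_level"): is_log_level,
    ("systemdata", "logging:log_level"): is_log_level,
    ("systemdata", "publish_idle_processes"): is_boolean,
}


def config_modify_command(component, setting, value):
    """Return the gp-wlm batch-mode command for a valid setting change."""
    check = SETTINGS.get((component, setting))
    if check is None:
        raise KeyError(f"unknown setting: {component} {setting}")
    if not check(str(value)):
        raise ValueError(f"invalid value for {component} {setting}: {value}")
    return f"gp-wlm --config-modify='{component} {setting}={value}'"


print(config_modify_command("gpdb_stats", "collect_frequency", 0.5))
```

Keeping the constraints in one dictionary makes it easy to see which component owns each setting, since logging:log_level appears under several components.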
Troubleshooting
You may collect all logs across the cluster using a single command. To create a tarball of all logs in the current directory, invoke:
bin/gather-cluster-logs.sh --symlink <LN>
where LN is the path to the gp-wlm symbolic link to the Greenplum Workload Manager installation directory.
Workload Manager Metric Reference
This topic lists the metrics Greenplum Workload Manager collects. These metrics can be used in the when clause of a Workload Manager rule to select Greenplum Database queries that trigger an action. Metrics in when clauses are prefixed with their scope, for example:
host:node_cpu_util > 35
The metric, in this example, is node_cpu_util and the scope is host. This metric will match any host with greater than 35% CPU utilization. The following expression matches a single postgres process on any host using more than 35% CPU:
host:pid:cpu_util > 35 and host:pid:name = 'postgres'
Metrics may also be listed in the optional including clause of a rule so that their values are saved with the record or event data when a rule is matched. When adding metrics to the including clause, omit the scope; Workload Manager finds the metric in the scope matched by the when clause.
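The relationship between a scoped metric in a when clause and the unscoped name used in an including clause can be shown with a short Python sketch. Both function names are hypothetical; the only rule applied is that the metric name is the part after the last colon.

```python
def split_metric(qualified):
    """Split 'scope:metric' into (scope, metric) at the last colon."""
    scope, _, metric = qualified.rpartition(":")
    return scope, metric


def including_list(qualified_metrics):
    """Build the text of an including clause, which omits scope prefixes."""
    return ", ".join(split_metric(q)[1] for q in qualified_metrics)


print(split_metric("host:pid:cpu_util"))
print(including_list(["session_id:host:total_cpu", "host:node_cpu_util"]))
```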
The metrics are arranged in the following categories:
Connections - number of backend connections and connections to the master
Identification - names of users, hosts, databases, ports, processes, and so on
Transactions - information about the current transaction, queries within transactions, and numbers of transactions committed and rolled back in the database
Date/Time - date and time metrics for a host
CPU - CPU utilization for hosts, processes, and sessions
Memory - memory utilization for processes and queries
Vmem - vmem utilization for segments and sessions
Spill - number of spill files (work files) created and total spill file size for a query
I/O - disk read/write statistics for databases, processes, and queries
Skew - disk read/write skew and memory skew for queries
Connections
Scope | Metric | Data type | Description
datid | numbackends | integer | Number of connections to this database
gpdb | total_master_connections | integer | Total number of connections to the master segment across all databases
Identification
Scope | Metric | Data type | Description
session_id:host:pid | usename | string | Name of the user logged into this backend
datid | datname | string | Name of this database
host:pid | long_name | string | By default, this is the absolute path to the process executable, but may be overridden by the process itself to show status information in utilities like ps(1)
host:pid | name | string | The filename of the executable
host:pid | state | string | Kernel state of this process; see the man page for proc(5) for more information
session_id:host:pid | application_name | string | Name of the application that is connected to this backend
session_id:host:pid | client_addr | string | IP address of the client connected to this backend
session_id:host:pid | client_port | integer | TCP port number that the client is using for communication with this backend
session_id:host:pid | datid | integer | OID of the database this backend is connected to
session_id:host:pid | datname | string | Name of the database this backend is connected to
session_id:host:pid | gpdb_segment_role | string | The current role of this Greenplum Database segment (MASTER, SEGMENT, MIRROR)
session_id:host:pid | usesysid | integer | OID of the user logged into this backend
Transactions
Scope | Metric | Data type | Description
datid | xact_commit | integer | Number of transactions in this database that have been committed
datid | xact_rollback | integer | Number of transactions in this database that have been rolled back
session_id:host:pid | backend_start | string | Time when this process was started, i.e., when the client connected to the server
session_id:host:pid | current_query | string | Text of this backend's current query. Example: Complex rule.
session_id:host:pid | query_start | string | Time when the currently active query was started
session_id:host:pid | runtime | integer | Time elapsed since the query started, in seconds. This includes query wait time. Example: Cancel any query where the session has run longer than 120 seconds.
session_id:host:pid | xact_start | string | Time when this process' current transaction was started
Date/Time
Note: Date and time values are stored in UTC standard time and converted to the local time zone for display. Use the SHOW TIMEZONE and SET TIMEZONE commands in psql to view and set the local time zone.
Scope | Metric | Data type | Description
host | day | integer | Day as 0 - 30
host | day_of_week | integer | Day as 0 - 6
host | day_of_week_string | string | Mon, Tue, …
host | month | integer | Month as 0 - 11
host | year | integer | Numeric year
host | hour | integer | Hour as 0 - 23
host | minute | integer | Minute as 0 - 59
CPU
Scope | Metric | Data type | Description
host | node_cpu_util | float | CPU utilization on this host averaged over active CPUs (%). Excludes idle time. Example: Record high CPU queries when overall CPU utilization on a host is high.
host:pid | avg_cpu_util | float | CPU utilization of this process averaged over the last two polling intervals (%).
host:pid | cpu_util | float | CPU utilization of this process (%). Example: Record high CPU queries when overall CPU utilization on a host is high.
session_id | cpu_skew | float | CPU utilization skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_cpu from all hosts running a certain query. Values closer to zero indicate less skew. This is not a percentage. Example: Complex rule.
session_id:host | total_cpu | float | Total CPU utilization of all processes running a certain query on a host (%). Example: Record high CPU queries on a host when overall CPU utilization on that host is high.
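The cpu_skew description can be read literally as the standard deviation of the per-host total_cpu values, raised to the third power. The Python sketch below follows that reading. It is an interpretation, not Workload Manager's implementation: the documentation does not say whether the population or the sample standard deviation is used, and the population form (statistics.pstdev) is assumed here.

```python
import statistics


def cpu_skew(total_cpu_by_host):
    """Cube of the population standard deviation of per-host total_cpu values."""
    # Assumption: population standard deviation; the source does not specify.
    return statistics.pstdev(total_cpu_by_host) ** 3


# Evenly loaded hosts: no skew.
print(cpu_skew([50.0, 50.0, 50.0]))
# Two hosts at 10% and 30%: the standard deviation is 10, so the skew is 10 ** 3.
print(cpu_skew([10.0, 30.0]))
```

Cubing makes the value grow quickly as hosts diverge, which is consistent with the statement that the metric is not a percentage and that values closer to zero indicate less skew.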
Memory
Scope | Metric | Data type | Description
host | mem_avail | integer | Total available memory on this host (free + buffers + cached) (kB)
host | mem_avail_pct | float | Available memory on this host as percentage of total
host | mem_buffers | integer | Memory in buffers on this host (kB)
host | mem_cached | integer | Cached memory on this host (kB)
host | mem_free | integer | Free memory on this host (kB)
host | mem_free_pct | float | Free memory on this host as percentage of total
host | mem_total | integer | Total memory on this host (kB)
host:pid | data_size_bytes | integer | The size of data + stack memory region in this process (bytes)
host:pid | dirty_size_bytes | integer | The size of dirty pages used in this process (bytes)
host:pid | library_size_bytes | integer | The size of library memory region in this process (bytes)
host:pid | program_size_bytes | integer | The total program size (bytes)
host:pid | program_size_pct | float | The size of this process as a percentage of total system memory
host:pid | resident_size_bytes | integer | The size of resident memory consumed by this process (bytes)
host:pid | resident_size_pct | float | The size of this process’ resident memory as a percentage of total system memory
host:pid | shared_size_bytes | integer | The size of all shared pages used by this process (bytes)
host:pid | text_size_bytes | integer | The size of code memory region in this process (bytes)
session_id:host | total_resident_size_pct | float | Total resident memory percentage of all processes running a certain query on a host
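The relationship between the host-level memory metrics can be sketched as follows. The sum for `mem_avail` comes from the table; the function name `host_memory_metrics` and the sample values are hypothetical, and the percentage formulas assume that "percentage of total" means a percentage of `mem_total`.

```python
def host_memory_metrics(mem_total_kb, mem_free_kb, mem_buffers_kb, mem_cached_kb):
    """Derive the available-memory metrics from the raw host values (kB)."""
    # mem_avail is described in the table as free + buffers + cached.
    mem_avail_kb = mem_free_kb + mem_buffers_kb + mem_cached_kb
    return {
        "mem_avail": mem_avail_kb,
        # Assumption: both percentages are relative to mem_total.
        "mem_avail_pct": 100.0 * mem_avail_kb / mem_total_kb,
        "mem_free_pct": 100.0 * mem_free_kb / mem_total_kb,
    }


# Hypothetical readings for a host with 16,000,000 kB of memory.
metrics = host_memory_metrics(
    mem_total_kb=16_000_000,
    mem_free_kb=2_000_000,
    mem_buffers_kb=500_000,
    mem_cached_kb=5_500_000,
)
print(metrics)
```

In this example `mem_free_pct` is 12.5 while `mem_avail_pct` is 50.0, which shows why a rule on free memory alone can look more alarming than one on available memory.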
Vmem
To use the vmem metrics in Workload Manager, you must first run the gp_session_state.sql script included with Greenplum Database on the postgres database. This is a one-time task.
The script creates the view session_level_memory_consumption in the database. See Viewing Session Memory Usage Information in the Greenplum Database Administrator Guide for information about this view and the gp_session_state.sql script.
Execute the script with the following command:
psql -d postgres -f $GPHOME/share/postgresql/contrib/gp_session_state.sql
The following configuration adjustments are recommended when using vmem metrics. Enter the commands at the gp-wlm command line:
config modify gpdb_stats publish_frequency=0.75
config modify gpdb_stats collect_frequency=0.5
config modify rulesengine engine:rule_frequency=0.5
Scope | Metric | Data type | Description
host:segment_id | total_vmem_size_mb | integer | Total vmem usage for this Greenplum segment in megabytes
host:segment_id | total_vmem_size_pct | float | Total vmem usage for this Greenplum segment as a percentage of total. Example: Record queries with high vmem usage when segment vmem usage is high as well.
session_id:host:segment_id | vmem_size_mb | integer | Total vmem used by the session on this segment
session_id:host:segment_id | vmem_size_pct | float | The percentage of this segment’s gp_vmem_protect_limit consumed by this session. Example: Record queries with high vmem usage when segment vmem usage is high as well.
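The example named in the table combines a session-scoped metric with a segment-scoped one. The Python sketch below shows that combination as a plain filter over sample data. It does not use Workload Manager rule syntax, and the function name `high_vmem_sessions`, the row layout, the host names, and both thresholds are hypothetical.

```python
def high_vmem_sessions(session_rows, segment_total_pct,
                       session_threshold=50.0, segment_threshold=80.0):
    """Return session IDs whose vmem use is high on a segment that is
    itself under vmem pressure.

    session_rows: dicts with session_id, host, segment_id, vmem_size_pct.
    segment_total_pct: maps (host, segment_id) to total_vmem_size_pct.
    Both thresholds are illustrative values, not recommendations.
    """
    flagged = []
    for row in session_rows:
        key = (row["host"], row["segment_id"])
        session_is_high = row["vmem_size_pct"] > session_threshold
        segment_is_high = segment_total_pct.get(key, 0.0) > segment_threshold
        if session_is_high and segment_is_high:
            flagged.append(row["session_id"])
    return flagged


# Hypothetical sample data for three sessions on two segments.
session_rows = [
    {"session_id": 11, "host": "sdw1", "segment_id": 0, "vmem_size_pct": 62.0},
    {"session_id": 12, "host": "sdw1", "segment_id": 0, "vmem_size_pct": 5.0},
    {"session_id": 13, "host": "sdw2", "segment_id": 2, "vmem_size_pct": 70.0},
]
segment_total_pct = {("sdw1", 0): 91.0, ("sdw2", 2): 72.0}

flagged = high_vmem_sessions(session_rows, segment_total_pct)
print(flagged)
```

Session 13 uses a large share of its segment's limit but is not flagged, because that segment as a whole is below the segment threshold.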
Spill
Scope | Metric | Data type | Description
session_id:host:pid | spillfile_count_across_cluster | integer | Total number of spill files (workfiles) created for this query across the cluster. Example: Record queries with high spill file count.
session_id:host:pid | spillfile_size_across_cluster | integer | Total size of spill files (workfiles) created for this query across the cluster, in bytes
I/O
Scope | Metric | Data type | Description
datid | blks_hit | integer | Number of times disk blocks were found already in the PostgreSQL buffer cache
datid | blks_read | integer | Number of disk blocks read in this database
host:pid | disk_read_bytes | integer | Total number of bytes read from disk by this process
host:pid | disk_read_bytes_per_sec | float | The number of bytes read from disk per second by this process
host:pid | disk_write_bytes | integer | Total number of bytes written to disk by this process. Example: Record number of bytes written to disk on a host by any query process.
host:pid | disk_write_bytes_per_sec | float | The number of bytes written to disk per second by this process
host:pid | read_bytes | integer | Total number of bytes (disk, network, IPC) read by this process
host:pid | read_bytes_per_sec | float | The number of bytes read per second (disk + net + IPC) by this process
host:pid | reads | integer | Total number of read system calls made by this process
host:pid | reads_per_sec | float | The number of total read(2) calls per second by this process
host:pid | write_bytes | integer | Total number of bytes (disk, network, IPC) written by this process
host:pid | write_bytes_per_sec | float | The number of bytes written per second (disk + net + IPC) by this process
host:pid | writes | integer | Total number of write system calls made by this process
host:pid | writes_per_sec | float | The number of total write(2) calls per second by this process
session_id:host | total_disk_read_bytes_per_sec | integer | Total disk read bytes-per-second of all processes running a certain query on a host
session_id:host | total_disk_write_bytes_per_sec | integer | Total disk write bytes-per-second of all processes running a certain query on a host. Example: Record total number of bytes written to disk per sec on a host by all query processes.
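The `session_id:host` totals in this table are described as sums over all processes running a query on a host. The following Python sketch shows that roll-up from per-process rows. The function name `total_per_session_host` and the row layout are hypothetical, and the sketch assumes each per-process row can be associated with a session ID, which the `host:pid` scope alone does not show.

```python
from collections import defaultdict


def total_per_session_host(process_rows, metric):
    """Sum a per-process metric over all processes of a session on a host.

    process_rows: dicts with session_id, host, pid and the named metric.
    Returns a dict keyed by (session_id, host).
    """
    totals = defaultdict(float)
    for row in process_rows:
        totals[(row["session_id"], row["host"])] += row[metric]
    return dict(totals)


# Hypothetical per-process readings for two sessions.
process_rows = [
    {"session_id": 7, "host": "sdw1", "pid": 4101, "disk_write_bytes_per_sec": 100.0},
    {"session_id": 7, "host": "sdw1", "pid": 4102, "disk_write_bytes_per_sec": 250.5},
    {"session_id": 7, "host": "sdw2", "pid": 5230, "disk_write_bytes_per_sec": 80.0},
    {"session_id": 9, "host": "sdw1", "pid": 4200, "disk_write_bytes_per_sec": 10.0},
]

totals = total_per_session_host(process_rows, "disk_write_bytes_per_sec")
print(totals)
```

The per-host totals produced this way are the values the skew metrics compare across hosts for a single session.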
Skew
Scope | Metric | Data type | Description
session_id | disk_read_bytes_per_sec_skew | float | Disk read skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_disk_read_bytes_per_sec from all hosts running a certain query
session_id | disk_write_bytes_per_sec_skew | float | Disk write skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_disk_write_bytes_per_sec from all hosts running a certain query
session_id | resident_size_pct_skew | float | Resident memory utilization skew across the cluster. Calculated as the cubed standard deviation of session_id:host:total_resident_size_pct from all hosts running a certain query