We are using the DHCP method for PnP discovery. This has been working well; however, our current deployments depend on having two uplinks to our upstream switch. On the upstream switch, we use the first link as a trunk and pass specific VLANs through, while the second link is an access port in our PnP-specific VLAN, 1400. This works without problems, but we need to remember to go back to the upstream switch, change that access port to match the first link, and add it to the existing port-channel for redundancy. We would like to eliminate this final step and make this truly zero-touch after install. I found the PnP stacking blog post helpful: Network Automation with Plug and Play (PnP) – Part 6.
Using a 6807 as the upstream switch and two 3650s as my deployment, I used the following config and was unable to get the deployment to discover:
6807:
pnp startup-vlan 1400
int po77
description APIC-EM-TEST
switchport
switchport mode trunk
switchport trunk allowed vlan 1,1000,1400
no port-channel standalone-disable
int gi 2/46-47
description TEST-APIC-EM
switchport
switchport mode trunk
switchport trunk allowed vlan 1,1000,1400
udld port
channel-group 77 mode active
Based on the article linked above, CDP should create VLAN 1400 on the deployment stack and have it pull an address from DHCP. It would then assign all active interfaces on the stack to 1400, in this case 1/1/1 and 2/1/1. Unfortunately, 1400 is never created on the 3650s and discovery never happens. The main VLAN 1400 lives on the 6807 and has a helper address that points at an MS DHCP server. Option 43 lives there and has been working with our current deployment standard.
What version of software are you running on the 6k? There are some issues with pnp startup-vlan on earlier versions of code.
Can you do a "show cdp tlv app" on the 6k? You should see the VLAN being advertised.
- Currently running 152-1.SY1a on this 6k
And it appears that it is not advertising 1400 to the deployment stack:
Btw, sometimes a "no pnp startup-vlan" followed by a reapplication of the command will help on the older code. You could try that first before upgrading. Please let us know how the 15.4(1)SY code goes.
Got this semi-working by accident during a deployment of another switch model. I was deploying a 3560CX and it never fully contacted the controller, but when I consoled in, the logs had the following:
%SW_VLAN-4-VLAN_CREATE_FAIL: Failed to create VLANs 1400: extended VLAN(s) not allowed in current VTP mode
Feb 9 14:34:41.351: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1400, changed state to down
Feb 9 14:34:41.369: %PM-2-VLAN_ADD: Failed to add VLAN 1400 - VTP error
Vlan 1400 is our PNP deployment vlan. So I'm not sure if it is something specific to this switch as this is the first one of this model we received or if it was because I reapplied the startup vlan and now it is "working". Either way, based on the logs, it looks like using this could be a problem. What do you think?
3560CX was on 15.2(4)E
Upstream 6K has not been upgraded yet
#1) pnp startup-vlan was not working earlier as the "show cdp tlv app" command showed that it was not sending the vlan information. The good news is that CDP is sending the correct vlan now.
#2) On a 3650, this is not a problem as it supports extended vlan. I set my startup-vlan to 1400, and saw the vlan being created on a 3650. However, as the compacts run classic IOS, I can see how 1400 would be a problem for them.
In short, you have good news on the 6k sending the CDP startup VLAN. It should be working on the 3650, but I can see why not on the 3560-CX (due to its architecture).
- This makes sense. I'm going to do some further testing and then see if I still need to upgrade the code on the 6K if it flakes out on me again. I'll probably end up testing the 3560CX with another VLAN ID to make sure it works with that as well. I found it interesting that PnP prefers the pnp startup-vlan method over a standard DHCP deployment: when I was trying to deploy this with my standard option 43 method, pnp startup-vlan took priority.
They work together.
Startup VLAN creates the VLAN on the switch and then configures DHCP on it. It also places any active links in this VLAN.
Once this happens either dhcp or DNS can be used for PnP.
I get what you are saying, but previously I was just making the upstream switch port an access port in 1400 so that the deployment switch would hit DHCP and get option 43. VLAN 1400 was never created on the deployment switch. So if I wanted to use pnp startup-vlan and two trunks for my 3650/3850s, I would not be able to use the same upstream switch, still configured for pnp startup-vlan, to deploy my 3560CX, because pnp startup-vlan would try to start PnP before the switch gets DHCP from upstream. This would fail to create VLAN 1400 at L2 on the deployment switch, because my PnP VLAN is an extended VLAN ID. Is there any way to specify that particular models use pnp startup-vlan versus straight DHCP, as I have done in the past? I am still going to test with pnp startup >1000 to make sure that it's really working on the 6K, but I wanted other options to keep in mind.
Currently, the "pnp startup-vlan" command is global on the upstream switch. We were trying to keep things simple by doing this.
I can see the challenge of a management VLAN in the extended range (above 1005).
A lot of people choose a management/native VLAN <1000 to make sure there are no issues with switches that do not support extended VLAN ranges. This should be less of an issue now, but as you can see, some of the smaller devices still have restrictions.
One of the challenges with automation is that sometimes it requires a re-engineering of a process to ensure it can be better automated.
My concern with adding different startup VLANs is that it is going to create complexity, as different models would have a different management VLAN. Would it be simpler to standardise on a management VLAN that works on all devices?
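The VLAN range concern discussed here can be sketched as a small pre-check before picking a startup VLAN. This is only a sketch, assuming the usual Cisco split (normal range 1-1005, of which 1002-1005 are reserved for legacy media, and extended range 1006-4094). The function names are hypothetical, and whether a given platform accepts an extended VLAN also depends on its VTP mode, as the 3560CX log in this thread suggests.

```python
def vlan_range(vlan_id: int) -> str:
    """Classify a VLAN ID as 'normal', 'reserved', or 'extended'.

    Assumes the common Cisco ranges: 1-1001 usable normal range,
    1002-1005 reserved, 1006-4094 extended.
    """
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    if vlan_id >= 1006:
        return "extended"
    if vlan_id >= 1002:
        return "reserved"
    return "normal"


def safe_startup_vlan(vlan_id: int) -> bool:
    """Hypothetical helper: True if the ID avoids the extended/reserved ranges."""
    return vlan_range(vlan_id) == "normal"


if __name__ == "__main__":
    # VLAN IDs mentioned in this thread.
    for vid in (44, 1000, 1400):
        print(vid, vlan_range(vid), safe_startup_vlan(vid))
```

With the IDs from this thread, 44 and 1000 fall in the normal range, while 1400 is extended and is the one that failed on the 3560CX.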
That makes sense and we are tossing around the idea of moving our deployment vlan to 44. Just need to make sure we aren't using that vlan id anywhere else.
Was able to get pnp startup vlan 44 working with the 3560cx, and I was also able to get my 3650s working with pnp start up 1400, and 44. I feel like doing the deployment like this is much faster, but I can't put my finger on why it seems that way.
I did notice that pnp startup-vlan does not clean up after itself, so VLAN 44/1400 and SVI 44/1400 still remain in the final deployment. Is there another way to clean these up after deployment other than EEM?
Also, I found that when moving interfaces on my 6k to the deployment devices, I would have to reapply the pnp startup-vlan command because it was not advertising on those specific interfaces, presumably because they were just plugged in and configured. Is that a bug or just something weird with the code I'm still running?
Startup VLAN is not supposed to clean up. It is meant to leave the VLAN in place, so you can manage the device via that VLAN (often referred to as the 'management VLAN'). It is possible to remove it, but what IP address/VLAN would you use to manage the device?
The issue you are seeing with needing to reapply is an issue that is resolved in the later version of code. Reapplying the command is a workaround.
Ok, that makes sense. So we use an SVI for vlan 1000 for our management on all our devices and with that we just set the IPs statically on that SVI. I also include this in my deployment template for PNP. The problem with using that vlan for PNP and management is that we didn't necessarily want to include that vlan in DHCP at all, but we would have to in order to do pnp with DHCP.
You have two options:
1) Reserve a small part of the address space for DHCP (just for PnP), then switch to a static IP when you provision. This will work fine, and is quite common.
2) Switch to another VLAN (e.g. 1000) with a static IP when you provision. This is a little more tricky, as you need to make sure the VLAN gets created. Depending on the switch platform, this may require some EEM help.
I think what we will end up doing is sticking with our current 1400 VLAN and seeing how far that takes us. We already have it in place, and the very small number of 3560CXs that we have can go through PnP in my test environment prior to racking them. We have maybe a dozen of them.
Ran into an issue while testing this. I can no longer provision a certain switch that I had been using quite a bit for testing. It was getting the pnp startup-vlan and making the VLAN no problem, but the pnp trace shows that it connects and doesn't get anything, as if I didn't have a rule in place. I double-checked the serial and my rule and it all looked fine. I tried a different switch with the same configuration, just a different serial #, and it worked fine. I did a few API calls, and even though I can see the rule in the GUI, the calls return no rule for that serial #. I even tried to put the rule in with a PUT and that did not work either. It seems like there may be a limit to how many times you can use the same serial number? Or is there a way to clear previous entries for that serial #? I also tried putting a rule in a completely different project and this still did not work.
There is no limit to the number of times you can PnP a device.
How well did you clean it? Did you remove all the old certificates etc., or just "wr er"?
If you remove the device from a rule, does it come up as unclaimed (which it should)?
It does not show up under unplanned devices for me to claim, but the pnp start up vlan works and from the pnp trace it looks like it hits the controller fine, but just never gets anything. I'll get the pnp trace for you to verify.
I use the following script every time I wipe a switch for testing:
configure terminal
crypto key zeroize
yes
!
end
test crypto pki trustpool reset
delete nvram:*.cer
delete stby-nvram:*.cer
write memory
write erase
delete flash:vlan.dat
erase nvram:
Reload
Just to be clear:
The API call you need to verify the rule is created is (where d168aa1a-bf61-46c9-b5d6-7ae4e27c48c8 is your projectID... yours will be different).
https://adam-iwan/api/v1/pnp-project/d168aa1a-bf61-46c9-b5d6-7ae4e27c48c8/device?offset=1&limit=10
The only time an entry is created in /pnp-device is when it is provisioned
https://adam-iwan/api/v1/pnp-device?matchDeviceState=true&offset=1&limit=10
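The two queries above can be sketched as small URL builders, which makes it easier to repeat the check for different projects. This is a minimal sketch: the host name and project ID are the example values from this post, the function names are hypothetical, and authentication and the actual HTTP request are deliberately left out.

```python
from urllib.parse import urlencode


def project_devices_url(host: str, project_id: str, offset: int = 1, limit: int = 10) -> str:
    """Build the URL that lists the rules (devices) inside one PnP project."""
    query = urlencode({"offset": offset, "limit": limit})
    return f"https://{host}/api/v1/pnp-project/{project_id}/device?{query}"


def pnp_devices_url(host: str, offset: int = 1, limit: int = 10) -> str:
    """Build the URL that lists devices that have started provisioning."""
    query = urlencode({"matchDeviceState": "true", "offset": offset, "limit": limit})
    return f"https://{host}/api/v1/pnp-device?{query}"


if __name__ == "__main__":
    # Example values taken from the post; yours will be different.
    host = "adam-iwan"
    project_id = "d168aa1a-bf61-46c9-b5d6-7ae4e27c48c8"
    print(project_devices_url(host, project_id))
    print(pnp_devices_url(host))
```

Keeping the two builders separate mirrors the distinction made here: a rule shows up under its project as soon as it is created, while /pnp-device only fills in once provisioning begins.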
Here is what I did before and what I have done again:
Then I tried to see if I could see it by serial number:
Both times I did the above, nothing was returned when I searched specifically for that serial #.
However, this time, I was able to see it in the project by querying with the project ID:
However, this rule never progresses out of pending, but the ports are advertising the correct pnp startup vlan:
APP TLV (Gig 2/44), Configured tlv type: 4099, value: 1
APP TLV (Gig 2/44), Configured tlv type: 4103, value: 44
APP TLV (Gig 2/45), Configured tlv type: 4099, value: 1
APP TLV (Gig 2/45), Configured tlv type: 4103, value: 44
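Checking this output by eye gets tedious across many ports, so here is a rough sketch of parsing it. It assumes the line format shown above and, judging only from this output, that TLV type 4103 carries the startup VLAN ID; both the constant name and the function name are hypothetical.

```python
import re

# Matches lines such as:
#   APP TLV (Gig 2/44), Configured tlv type: 4103, value: 44
TLV_LINE = re.compile(
    r"APP TLV \((?P<intf>[^)]+)\), Configured tlv type: (?P<type>\d+), value: (?P<value>\d+)"
)

# Assumption based on the output in this thread: type 4103 holds the startup VLAN.
STARTUP_VLAN_TLV = 4103


def advertised_startup_vlans(output: str) -> dict:
    """Return {interface: vlan_id} for every port advertising the assumed VLAN TLV."""
    vlans = {}
    for line in output.splitlines():
        match = TLV_LINE.search(line)
        if match and int(match.group("type")) == STARTUP_VLAN_TLV:
            vlans[match.group("intf")] = int(match.group("value"))
    return vlans


if __name__ == "__main__":
    sample = """\
APP TLV (Gig 2/44), Configured tlv type: 4099, value: 1
APP TLV (Gig 2/44), Configured tlv type: 4103, value: 44
APP TLV (Gig 2/45), Configured tlv type: 4099, value: 1
APP TLV (Gig 2/45), Configured tlv type: 4103, value: 44
"""
    print(advertised_startup_vlans(sample))
```

A port missing from the result would be a candidate for the "remove and reapply the command" workaround mentioned earlier in the thread.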
And the switch is seen in CDP neighbors:
Switch Gig 2/45 157 S I WS-C3650- Gig 1/1/2
Switch Gig 2/44 132 S I WS-C3650- Gig 1/1/1
VLAN 44 is created successfully on the switch and here is the pnp trace:
Switch#sh pnp trace
[02/17/17 13:30:52.100 UTC 1 399] Info: Startup config does not exists
What you are seeing with the API is correct. There is no entry in /pnp-device until the device starts provisioning. You will also see this if you reset an existing rule. The output from /pnp-device will be [].
What mechanism are you using for discovery?
[02/17/17 13:32:09.493 UTC 53 399] pnpa_disc_dhcp_dns: domain name University.liberty.edu on interface Vlan44
[02/17/17 13:32:09.493 UTC 54 399] pnpa_disc_dhcp_dns:Host name address resolution failed
[02/17/17 13:32:09.493 UTC 55 399] pnpa_dns_discovery:DNS discovery not successful
[02/17/17 13:33:33.584 UTC 76 399] Info: PnP profile configuration was unsuccessful
[02/17/17 13:33:33.585 UTC 77 399] start_pnpa_discovery: PnP discovery process failed
It looks like the DNS resolution failed (assuming that is how you are doing discovery)?
Hence there was no way to discover the controller.
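To pick the failing steps out of a longer trace, something like the following could help. It is a rough sketch that assumes the "[date time UTC n n] message" layout seen in the trace lines in this thread and simply flags messages containing a few failure keywords; the keyword list and function name are hypothetical, not part of any Cisco tooling.

```python
import re

# Assumed layout, based on the trace lines in this thread:
#   [02/17/17 13:32:09.493 UTC 54 399] message text
TRACE_LINE = re.compile(
    r"^\[?(?P<stamp>\d{2}/\d{2}/\d{2} [\d:.]+) UTC \d+ \d+\]\s*(?P<msg>.*)$"
)

# Hypothetical keyword list; extend as needed.
FAILURE_HINTS = ("failed", "not successful", "unsuccessful")


def trace_failures(trace: str) -> list:
    """Return the messages from a 'show pnp trace' dump that look like failures."""
    failures = []
    for line in trace.splitlines():
        match = TRACE_LINE.match(line.strip())
        if match and any(hint in match.group("msg").lower() for hint in FAILURE_HINTS):
            failures.append(match.group("msg"))
    return failures


if __name__ == "__main__":
    sample = """\
[02/17/17 13:32:09.493 UTC 53 399] pnpa_disc_dhcp_dns: domain name University.liberty.edu on interface Vlan44
[02/17/17 13:32:09.493 UTC 54 399] pnpa_disc_dhcp_dns:Host name address resolution failed
[02/17/17 13:32:09.493 UTC 55 399] pnpa_dns_discovery:DNS discovery not successful
"""
    for message in trace_failures(sample):
        print(message)
```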
I am using DHCP with option 43, and it works within the same VTP domain and VLAN with other switches.
Actually, there is another bit of the story I believe you should know that might relate. A long time ago, when we first got option 43 set up, I was finding that my deployments would connect to some external Cisco site and pull down a config that would just change the hostname to NS1. When I would do a pnp trace, it was resolving a Cisco URL to grab this random config. To remedy this, I changed the DNS servers in that scope to point at the APIC-EM controller. I also did this for the TFTP option within that scope. I would just delete those options from that scope, but they are globally set and therefore cannot be deleted from one scope without removing the global option altogether.
For some reason the DHCP discovery is not working. What version of code is the switch running?
There is an issue in 3.7.3 and below that makes DHCP problematic with NV1.
Release Notes for Cisco Network Plug and Play, Release 1.3x - Cisco
Just to confirm this, can you upgrade this switch to 3.7.4 and see if that resolves the issue?
It works with 3.7.4 and startup VLAN.
The issue is specific to pnp-startup vlan.
With startup-vlan, the management interface that does DHCP is moved off vlan 1 onto the new vlan. The issue was related to the dhcp request from this new interface.
Can you confirm the device can ping the controller (some time after the initial contact)? I am wondering if there is some VLAN pruning/STP blocking issue. I know you were using VLAN 1400 as the management VLAN...
- can device ping controller after say 5mins from initial contact?
It can ping the controller from the same VLAN no problem, even after it appears to stall out in the GUI on getting device info.
I notice this in the console though:
%Error opening tftp://10.255.72.116/network-confg (Timed out)
Redundant RPs - Simultaneous configs not allowed:locked from console
Redundant RPs - Simultaneous configs not allowed:locked from console
%Error opening tftp://10.255.72.116/network-confg (Timed out)
It goes back to me setting my scope options for DNS and TFTP to the controller IP; otherwise the global options would grab some sort of Cisco config from an external connection. As for the Redundant RPs message, I noticed that it used to log after the initial connection to the controller, and the switch would then grab its config from the controller no problem. Now it logs this message but then continues to try to get a config from TFTP.